EmbCAB operon of mycobacteria and mutants thereof

ABSTRACT

This invention relates to the identification, cloning, sequencing and characterization of the embCAB operon, which determines mycobacterial resistance to the antimycobacterial drug ethambutol. The embCAB operon encodes the proteins which are the target of action of ethambutol in Mycobacterium tuberculosis, Mycobacterium smegmatis, and Mycobacterium leprae. The present invention provides purified and isolated embC, embA, and embB nucleic acids which comprise the embCAB operon, as well as mutated forms of these nucleic acids. The present invention also provides one or more single-stranded nucleic acid probes which specifically hybridize to the wild type embCAB operon or the mutated embCAB operon, and mixtures thereof, which may be formulated in kits, and used in the diagnosis of drug-resistant mycobacterial strains. The present invention also provides methods for the treatment and prevention of mycobacterial infections. In addition, the probes of the present invention may be used to determine the susceptibility of mycobacteria to ethambutol.

STATEMENT OF GOVERNMENT INTEREST

This invention was made with government support under NIH Grant Nos. AI-37004 and AI-37015. As such, the government has certain rights in this invention.

BACKGROUND OF THE INVENTION

This invention is based upon the discovery by the inventors of the embC, embA, and embB genes that comprise the embCAB operon, and novel proteins encoded by the embCAB operon which are associated with the biosynthesis of the mycobacterial cell wall and are involved in resistance to the antimycobacterial drug ethambutol (EMB). The discovery of the embCAB operon and the proteins encoded by the operon will have important implications in the diagnosis and treatment of drug-resistant mycobacterial strains.

EMB is a selective antimycobacterial drug recommended for clinical use in 1961 (Karlson, A. G., Am Rev Resp Dis 84, 905-906 (1961)). Today, EMB remains an important component of tuberculosis treatment programs. Unfortunately, resistance to ethambutol has been described in 2-4% of clinical isolates of M. tuberculosis in the USA and other countries, and is prevalent among isolates from patients with multidrug-resistant tuberculosis (Bloch, A. B., Cauthen, G. M., Onorato, I. M., et al. Nationwide survey of drug-resistant tuberculosis in the United States. JAMA 271, 665-671 (1994)).

EMB targets the mycobacterial cell wall, a unique structure among prokaryotes which consists of an outer layer of mycolic acids covalently bound to peptidoglycan via the arabinogalactan (Besra, G. S. & Chatterjee, D. in Tuberculosis. Pathogenesis, protection, and control (ed Bloom, B. R.) 285-306 (ASM Press, Washington D.C., 1994)). Lipoarabinomannan, another cell wall component of significant biological importance, shares with arabinogalactan the overall structure of the arabinan polymer (Chatterjee, D., et al., J. Biol Chem 266, 9652-9660 (1991)).

EMB inhibits the in vivo conversion of [¹⁴C]glucose into cell wall arabinan (Takayama, K. & Kilburn, J. O., Antimicrob Agents Chemother 33, 1493-1499 (1989)), and results in the accumulation of the lipid carrier decaprenyl-P-arabinose (Wolucka, B. A., et al., J Biol Chem 269, 23328-23335 (1994)), which suggests that the drug interferes with the transfer of arabinose to the cell wall acceptor. The synthesis of lipoarabinomannan is also inhibited in the presence of EMB (Deng, L., et al. Antimicrob Agents Chemother 39, 694-701 (1995)), (Mikusova, K., et al., Antimicrob Agents Chemother 39, 2484-2489 (1995)); again, this indicates a specific effect on arabinan biosynthesis.

Thus, there is a need for the identification and characterization of new target genes and proteins in strains of mycobacteria that exhibit resistance to drugs. This would require the identification of genes that participate in the biosynthesis of the mycobacterial cell wall and the identification of mutants of these genes encoding proteins that confer resistance to drugs. Characterization of the molecular target(s) of EMB could provide information on targets for new chemotherapeutic agents, and facilitate development of diagnostic strategies for the detection of resistant strains of mycobacteria.

SUMMARY OF THE INVENTION

The present invention is directed to the nucleic acid sequences of the embCAB operon which encode the proteins comprising the target of action of ethambutol (EMB) in M. tuberculosis, M. smegmatis and M. leprae. The present invention further provides for the identification, isolation, sequencing, and characterization of these nucleic acid sequences and the proteins which they encode.

The present invention specifically provides purified and isolated wild type nucleic acid sequences of the embC, embA, and embB genes which comprise the embCAB operon, as well as mutated forms of these genes. The present invention also provides one or more single-stranded nucleic acid probes which specifically hybridize to wild type nucleic acid sequences of the genes of the embCAB operon and to mutated nucleic acid sequences of the embCAB operon, and mixtures thereof, which may be formulated in kits, and used in the detection of drug resistant mycobacterial strains.

The present invention also provides purified active embCAB proteins encoded by the genes of the embCAB operon. Also provided are antibodies immunoreactive with the protein(s) expressed by the wild type genes of the embCAB operon, and analogues thereof, as well as antibodies immunoreactive with the protein(s) expressed by the mutated genes of the embCAB operon. Additional objects of the invention will be apparent from the description which follows.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-1C: FIGS. 1A through 1C set forth the organization of the emb region of mycobacteria. FIG. 1A sets forth diagrammatical representations showing that M. smegmatis, M. tuberculosis, and M. leprae (GenBank accession numbers: U46844, U68480, L78821, and Z80343) present a conserved organization over a 14 kb region: three homologous emb genes preceded by a predicted coding region (Y), and by orfX, encoding a putative protein belonging to the short-chain alcohol dehydrogenase family (X). M. avium (GenBank U66560) is an exception, as the region contains the embAB genes preceded by a putative regulator (embR) (Belanger, A. E. et al. Proc Natl Acad Sci USA 93, 11919-11924 (1996)). In M. smegmatis, the region is limited by katG and by a putative transporter of the major facilitator family (Z). Corresponding areas in M. tuberculosis and M. leprae carry sequences encoding hypothetical proteins (I-IV). FIG. 1B sets forth the results of a primer extension analysis. A definitive transcription start site (TSS) is present 51 bp upstream of M. smegmatis embC (a purine preceded by a high GC content -10 hexamer). No TSS was identified upstream of embA (*=start codon). A possible TSS was mapped downstream of the overlapping embA stop and embB putative start codon. FIG. 1C sets forth a diagrammatical representation of the secondary structure analysis of the EmbA, EmbC, and EmbB (EMBCAB) proteins. The EMBCAB proteins are integral membrane proteins with 12 transmembrane domains and a C-terminal globular region of predicted periplasmic location. The proposed EMB resistance determining region in EmbB (ERDR) is marked.

FIG. 2: FIG. 2 sets forth an alignment of the protein sequences of EMBCAB proteins (SEQ. ID. NOS:1-44). Mutations (*) in EMB-resistant M. smegmatis (Sm) and M. tuberculosis (Tb) involved EmbB amino acids in a highly conserved region. M. leprae (Lp) presents a glutamine at the conserved Ile-303 position. Av: M. avium.

FIG. 3: FIG. 3 sets forth a schematic of the map of the embCAB operon as found in M. tuberculosis.

FIG. 4: FIG. 4 sets forth the nucleic acid sequence of the embCAB operon of M. tuberculosis (SEQ. ID. NO:45).

FIG. 5: FIG. 5 sets forth a schematic of the map of the embCAB operon as found in M. smegmatis.

FIG. 6: FIG. 6 sets forth the nucleic acid sequence of the embCAB operon of M. smegmatis (SEQ. ID. NO:46).

FIG. 7: FIG. 7 sets forth the results of automated DNA sequencing which indicates the exact missense mutations of codon 306 of the embB nucleic acid sequence. The sequence containing the missense mutation ATG is SEQ. ID. NO:47; GTG is SEQ. ID. NO:48; CTG is SEQ. ID. NO:49; ATA is SEQ. ID. NO:50; ATT is SEQ. ID. NO:51; and ATC is SEQ. ID. NO:52.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is directed to the nucleic acid sequences comprising the embCAB operon which encode the proteins comprising the target of action of ethambutol (EMB) in M. tuberculosis, M. smegmatis and M. leprae.

The present invention specifically provides purified and isolated wild type nucleic acid sequences comprising the embCAB operon. Also provided are mutated forms of these nucleic acids, such as mutated forms of the embB gene of the embCAB operon. As used herein, an "operon" is a cluster of related genes that encode for open reading frames. The "embCAB operon" as used herein consists of the embC, embA and embB genes arranged in a single operon. The "wild type embCAB operon" is herein defined as the normal form of the embC, embA, and embB genes which express gene products, and includes degenerate forms. The "mutated embCAB operon" is the mutated form of the normal embCAB operon, which contains one or more deletions, insertions, point, substitution, nonsense, missense, polymorphism, or rearrangement mutations, or a combination thereof, that may render the gene products expressed by the mutated embCAB operon resistant to EMB. As used herein, "nucleic acid" may be genomic DNA, cDNA or RNA, and may be the entire nucleic acid sequence comprising the embC, embA, and embB genes, or any portion of the sequence thereof.

The present invention specifically provides for an embCAB nucleic acid sequence isolated from M. tuberculosis. This sequence is designated MTBEMB and is set forth in FIG. 4. The present invention also provides for an embCAB nucleic acid which encodes the amino acid sequence set forth in FIG. 2. The embCAB amino acid operon sequence is designated Themba, Thembc, and Thembb.

The present invention additionally provides for an embCAB nucleic acid sequence isolated from M. smegmatis. This sequence is designated MSMEMB2 and is set forth in FIG. 6. Also provided by the present invention is an embCAB nucleic acid sequence which encodes the amino acid sequence set forth in FIG. 2. The embCAB amino acid operon sequence is designated Smemba, Smembc, and Smembb.

The present invention further provides for mutated nucleic acid sequences of the embCAB operon. In a preferred embodiment of the invention, a mutated nucleic acid sequence of the embCAB operon contains one or more mutations located in codon 306 of the embB nucleic acid sequence. These mutation(s) may be deletions, insertions, substitutions, missense, nonsense, point or rearrangement mutations, or a combination thereof. In a preferred embodiment of the invention, the mutation at codon 306 is a missense mutation. As used herein, a "missense" mutation is an alteration that changes a codon specific for one amino acid to a codon specific for another amino acid. Ordinarily, the wild type embB nucleic acid sequence encodes the amino acid methionine at position 306 (Met-306). In one embodiment of the invention, codon 306 of a mutated embB nucleic acid sequence is mutated to one of the following: ATA, ATC or ATT, resulting in the encoding of an isoleucine at codon 306 instead of a methionine; GTG, resulting in the encoding of a valine instead of a methionine; or CTG, resulting in the encoding of a leucine instead of a methionine.
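As a minimal illustrative sketch (not part of the disclosed methods), the codon 306 substitutions listed above can be written as a lookup from codon to encoded amino acid. The function name `classify_codon_306` and the `Met306Xxx` labels are hypothetical conventions introduced here for illustration only.

```python
# Codons named in the text for embB codon 306 and the amino acid each encodes.
# ATG is the wild-type codon (Met-306); the others are the described variants.
CODON_306 = {
    "ATG": "Met",
    "ATA": "Ile",
    "ATC": "Ile",
    "ATT": "Ile",
    "GTG": "Val",
    "CTG": "Leu",
}


def classify_codon_306(codon):
    """Return a label for the codon observed at embB position 306.

    Accepts DNA or RNA spelling in either case. Codons not named in the
    text are reported as "unlisted" rather than guessed at.
    """
    codon = codon.upper().replace("U", "T")
    amino_acid = CODON_306.get(codon)
    if amino_acid is None:
        return "unlisted"
    if amino_acid == "Met":
        return "wild type (Met-306)"
    return "Met306" + amino_acid


if __name__ == "__main__":
    for observed in ("ATG", "GTG", "CTG", "ATA", "AAA"):
        print(observed, "->", classify_codon_306(observed))
```

A lookup restricted to the listed codons is used deliberately, so that the sketch makes no claim about variants the text does not describe.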

In one embodiment of the invention, a mutation is located in codon 303 of the nucleic acid sequence of the embB gene. Ordinarily, codon 303 of the wild type embB nucleic acid sequence encodes the amino acid isoleucine (Ile-303). In one embodiment of the invention, a mutated embB nucleic acid sequence at codon 303 encodes a phenylalanine instead of an isoleucine; in another, codon 303 encodes a glutamine instead of an isoleucine.

Other mutated embB nucleic acid sequences provided by the present invention include: a missense mutation of codon 285 wherein TTC is mutated to TTA, resulting in a change from phenylalanine to leucine; a missense mutation of codon 330 wherein TTC is mutated to GTC, resulting in a change from phenylalanine to valine; and a missense mutation of codon 630 wherein ACC is mutated to ATC, resulting in a change from threonine to isoleucine.

The present invention further provides a mutated nucleic acid sequence of the embCAB operon that contains one or more mutations located in the embC nucleic acid sequence. In a specific embodiment of the invention, an embC mutated nucleic acid sequence contains a polymorphism of nucleotide 195 resulting in a change from guanine to cytosine. This polymorphism results in the substitution of valine for leucine (Leu-195) in the embC nucleic acid sequence.

The present invention also provides a mutated nucleic acid sequence of the embCAB operon that contains one or more mutations located in the embA nucleic acid sequence. In a particular embodiment of the invention, an embA mutated nucleic acid sequence contains a substitution of guanine for adenine at nucleotide 2898 of embA. Another embodiment provides an embA mutated nucleic acid sequence which contains a missense mutation of codon 5 wherein GGT is mutated to AGT, and results in a change from glycine to serine.
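The codon-level substitutions stated above for embB (codons 285, 330 and 630) and embA (codon 5) can be checked against the standard genetic code. The following is a sketch for illustration only; the helper name `describe` and the output format are assumptions introduced here, and the standard codon table is assumed to apply.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# (gene, codon number, wild-type codon, mutated codon) as stated in the text.
STATED_MUTATIONS = [
    ("embB", 285, "TTC", "TTA"),
    ("embB", 330, "TTC", "GTC"),
    ("embB", 630, "ACC", "ATC"),
    ("embA", 5, "GGT", "AGT"),
]


def describe(gene, position, wild_type, mutant):
    """Translate both codons and classify the change."""
    before, after = CODON_TABLE[wild_type], CODON_TABLE[mutant]
    if after == "*":
        kind = "nonsense"
    elif before == after:
        kind = "silent"
    else:
        kind = "missense"
    return f"{gene} codon {position}: {wild_type}->{mutant} ({before}->{after}, {kind})"


if __name__ == "__main__":
    for mutation in STATED_MUTATIONS:
        print(describe(*mutation))
```

Amino acids are reported in one-letter code (F, L, V, T, I, G, S), which corresponds to the phenylalanine, leucine, valine, threonine, isoleucine, glycine and serine named in the text.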

It is to be understood that the present invention also provides for nucleic acid sequences wherein any or all of the above described mutations coexist in the embCAB operon in any combinations thereof.

The wild type and mutant nucleic acid sequences of the embCAB operon can be prepared in several ways. For example, they can be prepared by isolating the nucleic acid sequences from a natural source, or by synthesis using recombinant DNA techniques. In addition, mutated nucleic acid sequences of the embCAB operon can be prepared using site-directed mutagenesis techniques. The amino acid sequences may also be synthesized by methods commonly known to one skilled in the art (Modern Techniques of Peptide and Amino Acid Analysis, John Wiley & Sons (1981); M. Bodansky, Principles of Peptide Synthesis, Springer Verlag (1984)). Examples of methods that may be employed in the synthesis of the amino acid sequences, and mutants of these sequences, include, but are not limited to, solid phase peptide synthesis, solution method peptide synthesis, and synthesis using any of the commercially available peptide synthesizers. The amino acid sequences, and mutants thereof, may contain coupling agents and protecting groups used in the synthesis of the protein sequences, which are well known to one of skill in the art.

The present invention also provides single-stranded nucleic acid probes and mixtures thereof for use in detecting drug resistance caused by a mutated nucleic acid of the embCAB operon. The nucleic acid probes may be DNA, cDNA, or RNA, and may be prepared from the mutated and/or wild type nucleic acid sequences comprising the embCAB operon. The probes may be the full length sequence of the nucleic acid sequences comprising the embCAB operon, or fragments thereof. Typical probes are 12 to 40 nucleotides in length. Generally, the probes are complementary to the embC, embA, or embB gene coding sequences, although probes to introns are also contemplated. The probes may be synthesized using an oligonucleotide synthesizer, and may be labeled with a detectable marker such as a fluorescent, enzyme, or radiolabeled marker, including ³²P and biotin, and the like. Combinations of two or more labeled probes corresponding to different regions of the embCAB operon also may be included in kits to allow for the detection and/or analysis of the embCAB operon by hybridization.
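To illustrate what "complementary to the coding sequence" and the typical 12 to 40 nucleotide length mean in practice, the following sketch derives a probe as the reverse complement of a chosen window of a target strand. The function names and the short target string are hypothetical placeholders introduced here; the target is not an embCAB sequence.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_probe(target, start, length):
    """Return a probe complementary to target[start:start + length].

    The 12-40 nucleotide bounds follow the typical probe length given in
    the text; they are enforced here only as an illustration.
    """
    if not 12 <= length <= 40:
        raise ValueError("probe length outside the typical 12-40 nucleotide range")
    region = target[start:start + length]
    if len(region) < length:
        raise ValueError("target too short for the requested probe")
    return reverse_complement(region)


# Hypothetical placeholder target, NOT taken from the embCAB operon.
toy_target = "ATGGCCATTGACCTGAAGGTC"
probe = design_probe(toy_target, 0, 12)

if __name__ == "__main__":
    print(probe)
```

In an actual application the window would be chosen from the sequences of FIG. 4 or FIG. 6, for example spanning a codon of interest, and specificity would still need to be established experimentally.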

Specifically, the nucleic acid sequences of the embCAB operon may be used to produce probes capable of identifying the nucleic acids in mycobacteria which encode EMB resistance, which probes can be used in the identification, treatment and prevention of diseases caused by microorganisms, to assess the susceptibility of various mycobacterial strains to EMB, and to determine whether various drugs are effective against mycobacterial strains.

The present invention also provides purified active embCAB proteins, encoded by the embCAB operon. The proteins may be expressed by the wild type or mutated nucleic acid sequences of the embCAB operon, or an analogue thereof. As used herein, "analogue" means functional variants of the wild type protein, and includes embCAB proteins isolated from bacterial sources other than mycobacteria, as well as functional variants thereof. The proteins may also be isolated from native cells, or recombinantly produced.

The present invention also provides antibodies immunoreactive with the proteins expressed by the wild type embCAB operon, and analogues thereof, as well as antibodies immunoreactive with the proteins expressed by the mutated nucleic acid sequences of the embCAB operon. The antibodies may be polyclonal or monoclonal and are produced by standard techniques. The antibodies may be labeled with standard detectable markers (e.g. chemiluminescent detection systems and radioactive labels such as ¹²⁵I) for detecting the wild type and mutated embCAB operons. The antibodies may also be presented in kits with detectable labels and other reagents and buffers for such detection.

The present invention also provides a method for detecting the presence of a microorganism exhibiting drug resistance in a subject, comprising detecting the presence of a mutated nucleic acid sequence of an embCAB operon of the subject. Specifically, the method determines whether a mycobacterium is resistant to ethambutol caused by a mutation in the nucleic acid sequence of the embCAB operon by detecting the presence of a mutated nucleic acid sequence of the embCAB operon in nucleic acid of the subject. The method may be used to determine whether persons in the population at large carry diseases resistant to drugs. This method is also useful for diagnosing drug resistant diseases. In a preferred embodiment of the invention, the nucleic acid sequences of the embCAB operons are identified and characterized to determine mycobacterial strains exhibiting resistance to EMB. In this embodiment, the nucleic acid probes employed in the method are mutant nucleic acid sequences of the embCAB operon. These mutant nucleic acid probes may contain one or more deletion, insertion, point, substitution, nonsense, polymorphism, missense, or rearrangement mutations. The mutant nucleic acid probes may be single-stranded, and labeled with a detectable marker.

Non-limiting examples of mycobacteria which can be tested for resistance using the method of the present invention include Mycobacterium tuberculosis, Mycobacterium avium, Mycobacterium smegmatis, Mycobacterium bovis BCG, Mycobacterium leprae, Mycobacterium africanum, and Mycobacterium intracellulare.

The presence of mutated nucleic acid sequences of the embCAB operon may be detected by procedures known to those skilled in the art including, but not limited to, standard nucleic acid sequencing techniques, restriction enzyme digestion analysis, hybridization with one or more probes hybridizable to the mutated and/or wild type sequences of the embCAB operon using standard procedures such as Southern Blot analysis, polymerase chain reaction using sense and antisense primers prepared from the mutated and/or wild type nucleic acid sequences of the embCAB operon, and combinations thereof.
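Where detection is by nucleic acid sequencing, the comparison of a sequenced sample against the wild type sequence reduces to finding the codons that differ. The sketch below shows one simple way to do that for two in-frame, equal-length coding fragments; the function name, the toy sequences, and the choice of numbering the first codon as 305 are all assumptions made for illustration and are not taken from the figures.

```python
def codon_differences(reference, sample, first_codon=1):
    """List (codon number, reference codon, sample codon) for each mismatch.

    Both inputs must be in-frame coding fragments of identical length;
    insertions and deletions are outside the scope of this sketch.
    """
    reference, sample = reference.upper(), sample.upper()
    if len(reference) != len(sample) or len(reference) % 3 != 0:
        raise ValueError("sequences must be equal-length and a multiple of 3")
    differences = []
    for offset in range(0, len(reference), 3):
        ref_codon = reference[offset:offset + 3]
        sample_codon = sample[offset:offset + 3]
        if ref_codon != sample_codon:
            differences.append((first_codon + offset // 3, ref_codon, sample_codon))
    return differences


# Hypothetical three-codon fragments; only the middle ATG/GTG pair mirrors
# a codon 306 change described in the text.
toy_reference = "GGCATGCTG"
toy_sample = "GGCGTGCTG"

if __name__ == "__main__":
    print(codon_differences(toy_reference, toy_sample, first_codon=305))
```

Restriction digestion, hybridization, and PCR-based approaches named above would be alternatives to this sequence-level comparison rather than steps within it.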

The presence of the mutated nucleic acid sequences of the embCAB operon also may be detected by detecting expression of the protein product of the embCAB operon. Such expression products include both mRNA as well as the protein product itself. mRNA expression may be detected by standard sequencing techniques, hybridization with one or more probes hybridizable to the mutated and/or wild type embCAB operon mRNA using standard procedures such as Northern blot analysis, dot and slot hybridization, S1 nuclease assay, or ribonuclease protection assays, polymerase chain reaction using sense and antisense primers prepared from the mutated and/or wild type nucleic acid sequences of the embCAB operon, and combinations thereof. The protein may be detected using antibodies to the proteins expressed by the mutated embCAB operon and/or the wild type embCAB operon by procedures known in the art including, but not limited to, immunoblotting, immunoprecipitation, solid phase radioimmunoassay (e.g. competition RIAs, immobilized antigen or antibody RIAs, or double antibody RIAs), enzyme-linked immunosorbent assay, and the like.

The present invention also provides for a method of assessing the susceptibility of a mycobacterium to EMB in a clinical sample comprising isolating the mycobacterial chromosomal DNA from a clinical sample, preparing oligonucleotides utilizing the wild-type or mutant embCAB operon nucleic acid sequence, amplifying the region of the embCAB operon from the clinical sample, and determining whether a mutated embCAB operon exists in the mycobacterial strain in the clinical sample, the presence of a mutation indicating that the mycobacterial strain is resistant to EMB.

The mycobacteria that may be assessed by this method of the present invention include, but are not limited to, Mycobacterium tuberculosis, Mycobacterium avium, Mycobacterium smegmatis, Mycobacterium bovis BCG, Mycobacterium leprae, Mycobacterium africanum, and Mycobacterium intracellulare.

Non-limiting examples of clinical samples that may be assessed by the methods of the present invention are urine, feces, blood, serum, mucus, cerebrospinal fluid, and any mixture thereof.

In a preferred embodiment of the invention, the determination of whether a mutation exists in the nucleic acid sequence of the embCAB operon is performed using single strand conformation polymorphism analysis.

The present invention also provides for a method of treating a mycobacterial infection in a subject by obtaining anti-DNA or anti-RNA nucleic acid sequences capable of inhibiting the mRNA activity of the embCAB operon of a mycobacterium, utilizing a wild type or the mutant nucleic acid of the embCAB operon; and administering an amount of said nucleic acid sequences, either alone or in combination with other compositions to treat the mycobacterial infection in a subject.

The anti-DNA or anti-RNA nucleic acid sequences employed in the method may be mutant or wild-type nucleic acid sequences of the embCAB operon. The mutant nucleic acid sequence may contain one or more deletions, insertions, substitutions, missense, nonsense, polymorphisms, point, or rearrangement mutations. The mutant nucleic acid sequence may be single-stranded, and labeled with a detectable marker.

Non-limiting examples of infections that can be treated using the methods of the present invention include those caused by mycobacteria selected from the group consisting of Mycobacterium tuberculosis, Mycobacterium avium, Mycobacterium smegmatis, Mycobacterium bovis BCG, Mycobacterium leprae, Mycobacterium africanum, and Mycobacterium intracellulare.

The nucleic acid sequences of the present invention are administered in conjunction with a suitable pharmaceutical carrier. Representative examples of suitable carriers include, but are not limited to, mineral oil, alum, and synthetic polymers. Vehicles for vaccines are well known in the art and the selection of a suitable vehicle is deemed to be within the scope of those skilled in the art from the teachings contained herein. The selection of a suitable vehicle is also dependent on the manner in which the nucleic acid sequences are to be administered. The nucleic acid sequences may be administered orally, enterally, subcutaneously, intraperitoneally, intravenously, or intranasally. Accordingly, as used herein, "subject" may be an embryo, fetus, newborn, infant, or adult. Further, as used herein "treating" is contacting a mycobacterium with the nucleic acids of the present invention, alone or in combination with other compositions.

The present invention additionally provides for the use of the nucleic acid sequences of the embCAB operon of the present invention as vaccines, or to improve existing vaccines.

Non-limiting examples of mycobacterial infections that can be treated using the vaccines of the present invention include those caused by mycobacteria selected from the group consisting of Mycobacterium tuberculosis, Mycobacterium avium, Mycobacterium smegmatis, Mycobacterium bovis BCG, Mycobacterium leprae, Mycobacterium africanum, and Mycobacterium intracellulare. For example, M. tuberculosis complex strains that are resistant to EMB often have reduced virulence and can be administered as vaccines. In addition, mutated genes of M. tuberculosis and M. bovis can be added to BCG or tuberculosis vaccines to provide attenuated mutant tuberculosis vaccines. These vaccines can be used to treat and prevent a wide variety of diseases, including tuberculosis, human immunodeficiency viral infection, polio, leprosy, malaria, tetanus, diphtheria, influenza, measles, mumps, hepatitis and rabies.

The present invention is described in the following Experimental Details Section, which is set forth to aid in an understanding of the invention, and should not be construed to limit in any way the invention as defined in the claims which follow thereafter.

Experimental Details Section

A. Materials and Methods

M. smegmatis mc² 6 (Snapper, S. B., Melton, R. E., Mustafa, S., Kieser, T. & Jacobs Jr., W. R. Isolation and characterization of efficient transformation mutants of Mycobacterium smegmatis, Mol Microbiol 4, 1911-1919 (1990)) isogenic mutants were selected on Middlebrook 7H11 (Difco Laboratories, Detroit Mich.) supplemented with oleic acid, bovine serum albumin, dextrose and catalase (OADC, Carr-Scarborough Microbiologicals) by stepwise exposure of 10⁸ bacteria to EMB [2,2'-(ethylenediimino)-di-1-butanol] (Sigma, Buchs, Switzerland). First level mutants, selected on plates containing EMB 2.5 μg ml⁻¹, were replated on EMB 20 μg ml⁻¹ to obtain second level mutants. Highly resistant (third level) mutants were obtained by plating second level mutants onto EMB 100 μg ml⁻¹. Minimum inhibitory concentrations (MICs), indicating the minimal concentration of an antibiotic that must be achieved at the site of infection to inhibit the growth of the microorganism, were determined in microtiter plates containing Middlebrook 7H9/OADC. Clinical isolates of M. tuberculosis were tested in 7H10/OADC or Bactec 12B at breakpoint concentrations of 5 or 7.5 μg ml⁻¹, respectively.

Chromosomal DNA was obtained from high-level resistant mutant M. smegmatis IMM30 after generation of protoplasts through overnight exposure to cycloserine 10 mg ml⁻¹ and ampicillin 1 mg ml⁻¹. Fragments (35-45 kb) obtained after partial digestion with Sau3A were ligated in XbaI-BamHI pYUB415 arms (W. R. Jacobs Jr.), in vitro packaged (Gigapack III Gold, Stratagene, La Jolla Calif.) and transduced into E. coli. Colonies selected on Luria-Bertani medium with ampicillin 50 μg ml⁻¹ were pooled, their cosmids were extracted, and DNA (1 μg) was electroporated into electrocompetent M. smegmatis mc² 155 (Snapper, S. B., Melton, R. E., Mustafa, S., Kieser, T. & Jacobs Jr., W. R. Isolation and characterization of efficient transformation mutants of Mycobacterium smegmatis, Mol Microbiol 4, 1911-1919 (1990)). Transformants were selected on 7H11/OADC containing hygromycin 50 μg ml⁻¹ and 2.5, 20, or 100 μg ml⁻¹ EMB. Complementing cosmids were extracted from M. smegmatis protoplasts and transformed into E. coli Stbl2 (Gibco BRL, Basel, Switzerland). To identify the minimal-size DNA fragment conferring EMB resistance, a Sau3A partial digestion of one of the cosmids (pIMM50) was cloned into pMD31 (Levin, M. E. & Hatfull, G. F. Mycobacterium smegmatis RNA polymerase: DNA supercoiling, action of rifampicin and mechanism of rifampicin resistance. Mol Microbiol 8, 277-285 (1993)), electroporated into M. smegmatis mc² 155, and resistant transformants were identified on 7H11/OADC containing kanamycin 50 μg ml⁻¹ and 2.5, 20, or 100 μg ml⁻¹ EMB. Plasmid preparation was performed as described above and the smallest clone conferring the phenotype (pIMM99) was selected for sequencing. In addition, the 40 kb insert of pIMM50 was mobilized via PacI digestion to the integrative vector pYUB412 (W. R. Jacobs Jr.), and in vitro packaged. The resulting construct, pIMM52, was electroporated into M. smegmatis mc² 155 to assess its ability to confer an EMB-resistant phenotype as a single copy.

Sequencing was performed after shotgun fragmentation (sonication) of the emb region. Accuracy was assured by a sequence redundancy of 6. Identification of ORFs and prediction of coding sequences were done by using MacVector software (Kodak, New Haven Conn.) after construction of an M. smegmatis codon usage table. Sequence databases (GenBank, The Institute for Genomic Research, and MycDB World Wide Web sites) were searched for homologies. Topology prediction and searches for palindromic and stem-loop sequences were performed by using PredictProtein EMBL (Rost, B. & Sander, C. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19, 55-72 (1994)), and the University of Wisconsin GCG package.

Mapping of the emb region on the M. tuberculosis chromosome was performed by probing an ordered cosmid library (Philipp, W. J. et al. An integrated map of the genome of the tubercle bacillus, Mycobacterium tuberculosis H37Rv, and comparison with M. leprae, Proc Natl Acad Sci USA 93, 3132-3137 (1996)) with a M. tuberculosis PCR fragment amplified using degenerate primers based on the M. smegmatis embA sequence. A positive clone (Y457) was mapped by restriction analysis and hybridization, and the appropriate fragments were subcloned and sequenced using a shotgun approach. The 40 kb insert was mobilized via PacI digestion to shuttle vector pYUB415, and in vitro packaged. The resulting construct, pIMM128, was electroporated into M. smegmatis mc² 155 to assess its ability to confer an EMB-resistant phenotype. The embA probe was also used in Southern blot analysis to determine the presence of the emb gene cluster, or operon, in other mycobacteria (M. vaccae, M. terrae, M. nonchromogenicum, M. kansasii, M. marinum, M. intracellulare, M. chitae, M. chelonae, M. aurum), using Staphylococcus aureus and E. coli as negative controls.

Total RNA was prepared from M. smegmatis mc² 155 by 5 sec sonication (Branson Sonifier 250 at constant output of 3) of pellets in TRIzol (Gibco). After addition of chloroform, the organic phase was separated, and RNA was precipitated. For transcription start analysis, primers complementary to the coding regions of embC (5'-TCGCCATCAGCGTGCCGAGA) (SEQ. ID. NO:53), embA (5'-TCTCCTCCACCGGGAGCAGT) (SEQ. ID. NO:54), and embB (5'-GTTGCGCACGATCACGTCGA) (SEQ. ID. NO:55) were end-labelled by using [γ-³²P]ATP and T4 polynucleotide kinase. Total RNA (10 μg) was used in a primer extension reaction loaded onto a 6% polyacrylamide gel together with the corresponding sequencing reaction.

Mutation analysis of pIMM99, as well as of mutant strains of M. smegmatis, was performed by detailed sequence analysis of promoter, intergenic, 600 bp of C-terminal sequence and the second cytoplasmic loop for all three emb genes. M. tuberculosis H37Rv and a collection of 70 clinical isolates were evaluated by targeted sequencing or by SSCP (single strand conformation polymorphism) of the described regions. Automated SSCP screening for embB mutations in M. tuberculosis was done as reported (Telenti, A., et al. Antimicrob Agents Chemother 37, 2054-2058 (1993)) by using fluorescent-labelled primers:

TE13f (5'-CAATTGCCCAGCTCCTCCTC) (SEQ ID NO:56) and

TE14f (5'-ACAGACTGGCGTCGCTGACA) (SEQ ID NO:57).
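As a quick sanity check on the two SSCP primers listed above, the following Python sketch computes the length, GC content, and reverse complement of each oligonucleotide. Only the primer sequences come from the text; the helper names are illustrative assumptions and are not part of the original protocol.

```python
# Illustrative helpers (hypothetical names); only the primer sequences
# are taken from the text above.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMERS = {
    "TE13f": "CAATTGCCCAGCTCCTCCTC",  # SEQ ID NO:56
    "TE14f": "ACAGACTGGCGTCGCTGACA",  # SEQ ID NO:57
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.0%}", reverse_complement(seq))
```

Both primers are 20 nucleotides long with the same GC content, which is consistent with their use as a matched pair in a single amplification.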

B. Results and Discussion

Resistance to EMB was used as a tool to identify genes participating in the biosynthesis of the mycobacterial cell wall, which led to the identification of the embCAB operon. The development of resistance to EMB was evaluated in the rapidly growing M. smegmatis (EMB MIC=0.5 μg/ml). Spontaneous EMB-resistant mutants of M. smegmatis were selected by stepwise exposure to increasing concentrations of the drug. Attainment of high level EMB resistance required three independent mutations at an approximate frequency of 10⁻⁷ for each step. Low level (emb-1) and moderate level (emb-1 emb-2) resistance mutants exhibited EMB MICs of 20 and 100 μg/ml, respectively. High level resistant mutants (emb-1 emb-2 emb-3) were not inhibited by extremely high concentrations of EMB (MIC>256 μg/ml). These results indicated that three consecutive genetic events, involving an accumulation of multiple mutations in one or more genes, had taken place.
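The stepwise MIC values reported above can be expressed as fold increases over the wild type M. smegmatis MIC. The sketch below is illustrative only: the variable and function names are assumptions, and the value for the triple mutant is a lower bound because the text reports it as MIC>256 μg/ml.

```python
# MIC values in ug/ml as stated in the text; names are illustrative.
WILD_TYPE_MIC = 0.5

STEPWISE_MICS = {
    "emb-1": 20.0,
    "emb-1 emb-2": 100.0,
    "emb-1 emb-2 emb-3": 256.0,  # reported as >256, so treat as a lower bound
}


def fold_increase(mic: float, baseline: float = WILD_TYPE_MIC) -> float:
    """Return the MIC expressed as a multiple of the baseline MIC."""
    return mic / baseline


if __name__ == "__main__":
    for genotype, mic in STEPWISE_MICS.items():
        print(f"{genotype}: {fold_increase(mic):.0f}-fold")
```

Under these figures the first step corresponds to a 40-fold increase, the second to a 200-fold increase, and the third to more than a 512-fold increase over the wild type.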

To identify the genes conferring EMB resistance, a genomic library from a high level EMB-resistant mutant of M. smegmatis was introduced into wild type M. smegmatis mc² 155 (Snapper, S. B. et al., Mol Microbiol 4, 1911-1919 (1990)). Four overlapping cosmids were identified which conferred a resistant phenotype. The minimum size fragment capable of conferring EMB resistance was 9 kb (pIMM99). Sequencing of pIMM99 plus 7 kb of upstream M. smegmatis sequence revealed three homologous open reading frames, each approximately 3200 bp (embC, embA, embB), and four additional potential coding regions (see FIG. 1A). An M. smegmatis probe was used to identify the corresponding cosmid from an ordered M. tuberculosis library (Philipp, W. J. et al., Proc Natl Acad Sci USA 93, 3132-3137 (1996)). Sequencing of the M. tuberculosis homologue revealed the presence of complete copies of the emb genes. The M. leprae emb region was identified in one of the sequenced cosmids from the M. leprae genome project (Eiglmeier, K., et al., Mol Microbiol 7, 197-206 (1993)). Interestingly, while these mycobacteria presented a conserved organization over a 14 kb region, the emb operon of M. avium (Belanger, A. E. et al., Proc Natl Acad Sci USA 93, 11919-11924 (1996)) only contains the embAB genes (see FIG. 1A). The emb genes were found by Southern blot analysis to be present in all mycobacteria tested (results not shown). However, nucleotide and protein searches revealed no significant homology of the emb genes to any non-mycobacterial submissions to public domain databases.

The emb genes represent examples of gene duplication, encoding proteins with amino acid similarities in the range of 61-68%. The emb region is organized as an operon, a concept supported by primer extension data indicating the polycistronic nature of the transcripts (FIG. 1B), and by the finding that there is almost no untranslated intercistronic region between the emb genes, indicating a tight translational coupling typical for proteins with coordinated expression. However, the presence at the junction between embA and embB of a potential secondary promoter in M. smegmatis, and of a stem-loop structure in M. tuberculosis, indicates that the last emb gene could be differentially regulated. Topology analysis suggests that the EmbCAB proteins are integral membrane proteins with 12 transmembrane domains and a C-terminal globular region of ca. 400 amino acids of predicted periplasmic location (see FIG. 1C).

The character of the emb products as membrane proteins with coordinated expression is consistent with a role in the synthesis of exopolysaccharides, where the several homologous genes would be responsible for the synthesis of the various arabinan linkage motifs. Indeed, compelling evidence for the role of the Emb proteins as arabinosyl transferases responsible for the polymerization of arabinose into the arabinogalactan has been recently presented by Belanger, et al. (Belanger, A. E. et al. Proc Natl Acad Sci USA 93, 11919-11924 (1996)).

Identification of the embCAB genes allowed for the analysis of the molecular basis of resistance of mycobacteria to EMB. For this purpose, the sequence of resistant M. smegmatis mutants was compared with that of the parental strain, and the effect of mutant alleles on the phenotype of recombinant strains was evaluated. Sequence analysis of pIMM99 and mutant strains of M. smegmatis identified an isoleucine to phenylalanine mutation in the embB of intermediate and high level resistant mutants which was not present in the parental susceptible strain. This substitution involves a residue (Ile303, M. tuberculosis coordinates) located in a loop of predicted cytoplasmic orientation (FIG. 1C) which is conserved among different mycobacterial Emb proteins (FIG. 2). In M. tuberculosis, mutations associated with resistance involved Met306, a conserved methionine codon. Mutation of M. tuberculosis Met306 to isoleucine or valine was identified in 12 of 27 (44.4%) EMB-resistant, unrelated, clinical isolates, but in none of 43 susceptible strains. Another interesting observation is that M. leprae exhibited a glutamine at the conserved Ile303 embB position, while maintaining an isoleucine residue at the homologous embC position. As EMB is not active against this organism (Dr. Pannikar, WHO, personal communication), this finding could represent the molecular basis for the intrinsic EMB resistance of M. leprae.

Evidence demonstrating that mutations in embB did confer EMB resistance was further supported with gene transfer experiments. Transformation of susceptible M. smegmatis with a multicopy plasmid carrying the mutant embCAB genes (embB2 allele) resulted in a more than 500-fold increase in EMB MIC (MIC>240 μg/ml). In contrast, transformation with the wild type allele from M. tuberculosis or M. smegmatis resulted in 2- to 20-fold increase in MIC values (MIC=1 and 10 μg/ml, respectively). Single-copy integration into the M. smegmatis chromosome of the 40 kb fragment from the high level resistant mutant resulted in a non-significant 2-fold increase in resistance to EMB (MIC=1.0 μg/ml). This result suggests dominance of the wild type to the embB2 allele in a merodiploid state, and indicates that the additional mutations required to achieve high level resistance, likely of regulatory nature, are not present within the region.

These data confirm the association of a specific mutation with a resistance phenotype and indicate that the Emb proteins are a probable target of the drug. Though the actual mechanism of action of the drug remains to be determined, it can be speculated that the mutations modify a glycosyltransferase active site to which EMB, proposed to act as an arabinose analogue (Maddry, J. A., et al. Res Microbiol 147, 106-112 (1996)), binds. The complexity of the biological phenomenon of EMB resistance is, however, underscored by (i) the observation that multiple steps are necessary to achieve high level resistance, and (ii) the yet unmapped location of EMB-resistance mutations in the first- and last-step laboratory resistance mutants of M. smegmatis, and in a proportion of M. tuberculosis isolates. Thus, additional genes should be operative in the development of resistance, a notion consistent with the observed pleiotropic action of the drug (Deng, L., et al., Antimicrob Agents Chemother 39, 694-701 (1995)).

The data in the present manuscript add to the understanding of the mechanism of action and resistance to antituberculous drugs (Musser, J. M., Clin Microbiol Rev 8, 496-514 (1995)). To date, 11 mycobacterial genes associated with drug resistance have been identified: katG, mabA/inhA, and ahpC (isoniazid) (Zhang, Y., et al. Nature 358, 591-593 (1992); Banerjee, A. et al. Science 263, 227-230 (1994); Deretic, V., et al. Mol Microbiol 17, 889-900 (1995); Wilson, T. M. & Collins, D. M., Mol Microbiol 19, 1025-1034 (1996); Telenti, A., et al. J. Clin Microbiol in press, (1996)), rpoB (rifampin) (Telenti, A. et al. Lancet 341, 647-650 (1993)), rrs and rpsL (streptomycin) (Finken, M., et al. Mol Microbiol 9, 1239-1246 (1993); Honore, N. & Cole, S. T. Antimicrob Agents Chemother 38, 238-242 (1994)), gyrA, gyrB and lfrA (fluoroquinolones) (Takiff, H. E. et al. Antimicrob Agents Chemother 38, 773-780 (1994); Kocagoz, T., et al. Antimicrob Agents Chemother 40, 1768-1774 (1996); Takiff, H. E., et al. Proc Natl Acad Sci USA 93, 362-366 (1996)), and pncA (pyrazinamide) (Scorpio, A. & Zhang, Y. Nature Med 2, 662-667 (1996)). This information allowed the development of novel strategies for detection of drug resistance, reducing the length of testing from weeks to days (Telenti, A. & Persing, D. H. Res. Microbiol 147, 73-79 (1996)). Alternative means of testing would prove particularly useful for EMB, as susceptibility testing to this drug remains suboptimally standardized (Lazlo, A., et al. Quality assurance programme for Drug susceptibility testing of Mycobacterium tuberculosis in the WHO/IUATLD supranational laboratory network: first round of proficiency testing. Journal of the International Union against Tuberculosis and Lung Disease in press, (1997)), due to the poor stability of the agent in growth media (Gangadharam, P. R. & Gonzalez, E. R. Am Rev Resp Dis 102, 653-655 (1970)), and to discrepancies encountered when performing the analysis in solid versus liquid media (Heifets, L. B., et al. Antimicrob Agents Chemother 30, 927-932 (1986)). In the current study, automated sequencing and automated SSCP were used for detection of embB mutations. Both techniques unambiguously identified the prevalent Met306 mutation in clinical isolates of EMB-resistant M. tuberculosis.

In conclusion, the selective and broad-spectrum activity of EMB against mycobacteria correlates with the identification of a unique and conserved operon structure which appears to encode its drug target. Identification of mutations opens the possibility to implement genotype-based tests for the rapid detection of EMB resistance in M. tuberculosis. Mapping of additional mutations associated with resistance will guide the efforts to define the mechanism of action of EMB and to identify discrete active domains in the Emb proteins, the putative mycobacterial arabinosyl transferases.

The Role of embB Mutations

I. Materials and Methods

A. Bacterial Strains

The study of the role of embB mutations was based on a sample of 118 M. tuberculosis isolates recovered in diverse localities in the United States, Europe, Yemen, Philippines, Japan, and India. The sample includes 85 EMB-resistant and 33 EMB-susceptible organisms. The bacteria were recovered from diseased patients with a variety of distinct clinical manifestations, and included both pulmonary and extrapulmonary sources.

B. IS6110 Genotyping and Genetic Group Assignment

Strain growth and DNA isolation were performed in laboratories equipped with biosafety level 3 facilities. Isolation of chromosomal DNA and IS6110 typing were performed by previously described methods (van Embden, et al., J. Clin. Microbiol. 31:406-409 (1993)). Recent data have demonstrated that all M. tuberculosis isolates can be assigned to one of three genetic groups based on polymorphisms present at codon 463 of the gene (katG) encoding catalase-peroxidase, and codon 95 of the gene (gyrA) encoding the A subunit of DNA gyrase (Srinand, et al., Molecular Population Genetics of katG codon 463 and gyrA codon 95 polymorphisms in the Mycobacterium tuberculosis complex (MTC), abstr. U119, p. 122. In Abstracts of the General Meeting of the American Society for Microbiology, Washington, D.C.). The group designations used are as follows: group 1, katG463 CTG (Leu) plus gyrA95 ACC (Thr); group 2, katG463 CGG (Arg) plus gyrA95 ACC (Thr); and group 3, katG463 CGG (Arg) plus gyrA95 AGC (Ser).
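The three group designations above amount to a lookup on the codon pair at katG463 and gyrA95. A minimal Python sketch follows; the function and variable names are illustrative assumptions, and codon combinations not listed in the text are deliberately returned as unassigned rather than guessed.

```python
from typing import Optional

# Codon pairs (katG codon 463, gyrA codon 95) and their group numbers,
# exactly as listed in the text. Names are illustrative.
GENETIC_GROUPS = {
    ("CTG", "ACC"): 1,
    ("CGG", "ACC"): 2,
    ("CGG", "AGC"): 3,
}


def assign_genetic_group(katg463: str, gyra95: str) -> Optional[int]:
    """Return genetic group 1, 2, or 3, or None if the pair is not listed."""
    return GENETIC_GROUPS.get((katg463.upper(), gyra95.upper()))


if __name__ == "__main__":
    print(assign_genetic_group("CGG", "ACC"))
```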

C. Ethambutol Susceptibility Testing

Routine EMB susceptibility testing was conducted by either the BACTEC radiorespiratory method (7.5 μg/ml) or agar diffusion with Middlebrook 7H10 medium (5 μg/ml). To determine EMB MICs, susceptibility testing was performed by agar diffusion as described previously (National Committee for Clinical Laboratory Standards, 1990. Antimicrobial Susceptibility Testing. Proposed Standard M24-P. National Committee for Clinical Laboratory Standards, Villanova, Pa.). EMB (Sigma Chemical Co., St. Louis, Mo.) was incorporated into 7H10 agar (Difco, Detroit, Mich.) at the following concentrations: 0, 0.5, 1, 2.5, 5, 10, 20, and 40 μg/ml. Frozen cultures of M. tuberculosis stored at -70° C. in 7H9 broth in 15% glycerol were recovered on Lowenstein-Jensen slants. Each isolate was then subcultured on 7H10 agar, suspended in 7H9/ADC broth (Difco) in the presence of glass beads to a concentration of 10⁷ CFU/ml, diluted 1:100, and plated in duplicate (0.1 ml per quadrant). The plates were incubated at 35° C. in the presence of 5% CO₂. Each plate was checked weekly, and the results were recorded between weeks 3 and 4.
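The inoculum implied by the plating scheme above can be worked out directly from the stated figures (10⁷ CFU/ml, 1:100 dilution, 0.1 ml per quadrant). The Python sketch below is only an arithmetic illustration; the function name is an assumption, and the result is a nominal value that ignores plating efficiency.

```python
def cfu_per_quadrant(stock_cfu_per_ml: float,
                     dilution_factor: float,
                     plated_volume_ml: float) -> float:
    """Nominal number of CFU delivered to one quadrant of a plate."""
    return stock_cfu_per_ml / dilution_factor * plated_volume_ml


if __name__ == "__main__":
    # 1e7 CFU/ml diluted 1:100, with 0.1 ml plated per quadrant
    print(cfu_per_quadrant(1e7, 100, 0.1))
```

With the values from the text, the nominal inoculum is about 10⁴ CFU per quadrant.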

D. Automated DNA Sequencing of the embCAB Locus

The 7.5 kb region of embCAB was sequenced in various stages using a total of 30 primers. A GeneAmp System 9600 thermocycler (Perkin Elmer Corp., Foster City, Calif.) was used for all target amplifications, with the following parameters: annealing temperature at 65° C. for 39 sec, extension at 72° C. for 82 sec, and a denaturation step at 94° C. for 54 sec for a total of 30 cycles. Each reaction was preceded by an initial denaturation step at 94° C. for 60 sec, and terminated with a final extension step at 72° C. for 5 minutes. DNA sequencing reactions were performed with the Taq Dye Deoxy terminator cycle sequencing kit using Amplitaq DNA polymerase FS (Applied Biosystems, Inc., Foster City, Calif.). The forward, reverse, and internal primers were used for sequencing. Sequence data generated with an ABI 377 automated instrument were assembled and edited electronically with the EDITSEQ, ALIGN, and MEGALIGN programs (DNASTAR, Madison, Wis.), and compared with a published sequence (Telenti, et al., Nature Med. (in press, 1997)) and a sequence generated from M. tuberculosis strain H37Rv for the entire embCAB locus.
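The cycling parameters above can be written down as a small configuration, which also makes it easy to estimate the programmed run time. This Python sketch uses only the hold times stated in the text; the constant and function names are illustrative assumptions, and temperature ramping between steps is not included.

```python
# Hold times in seconds, as stated in the text; names are illustrative.
INITIAL_DENATURATION_S = 60       # 94 C
DENATURATION_S = 54               # 94 C, per cycle
ANNEALING_S = 39                  # 65 C, per cycle
EXTENSION_S = 82                  # 72 C, per cycle
FINAL_EXTENSION_S = 5 * 60        # 72 C
CYCLES = 30


def total_hold_time_s() -> int:
    """Sum of all programmed hold times, excluding ramp times."""
    per_cycle = DENATURATION_S + ANNEALING_S + EXTENSION_S
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    seconds = total_hold_time_s()
    print(f"{seconds} s ({seconds / 60:.1f} min)")
```

The programmed holds add up to 5610 seconds, or 93.5 minutes, before any allowance for ramping.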

Nineteen M. tuberculosis isolates (16 EMB-resistant and 3 EMB-susceptible) were analyzed for the entire 7.5 kb region except a 75-bp segment located at the beginning of embB. This region is characterized by the presence of a stem-loop structure (Telenti, et al., Nature Med. (in press, 1997)), and sequence data could not be reproducibly obtained for all isolates. The 5' region (1,892 bp) of embB was analyzed in an additional 69 EMB-resistant and 30 EMB-susceptible M. tuberculosis isolates from diverse geographic localities, representing 70 IS6110 fingerprints.

II. Results

A. Alteration in embCAB Locus among 19 Strains Sequenced in Entirety

Analysis of 16 EMB-resistant and 3 EMB-susceptible isolates identified a total of four polymorphic nucleotide sites, each located in a different region of the three-gene locus. For example, a polymorphism (G→C) was identified at nucleotide 195 of embC in two EMB-resistant isolates and one EMB-susceptible isolate. This change would produce a Val→Leu substitution in embC. These same three isolates also had a G→A synonymous substitution at nucleotide 2898 of embA. One resistant isolate also had a missense mutation in codon 5 of embA (GGT→AGT) resulting in a Gly→Ser alteration. Interestingly, eight EMB-resistant strains had three distinct missense mutations in codon 306 of embB (ATG, Met) resulting in substitution with Ile (ATA and ATC) or Val (GTG). Hence, the sequence data for the entire 7.5 kb embCAB locus suggested that mutations associated with resistance were overrepresented in the 5' portion of embB.

B. Variation in the 5' Segment of embB among 69 EMB-resistant and 30 EMB-susceptible Isolates.

Because the results obtained from sequencing the entire 7.5 kb embCAB region suggested that mutations in codon 306 of embB were important in EMB resistance, a 1,892 bp embB region was sequenced in 69 EMB-resistant and 30 EMB-susceptible organisms. This sample included 51 EMB-resistant and 25 EMB-susceptible strains judged to be epidemiologically unassociated because they had distinct IS6110 patterns. All 30 EMB-susceptible organisms had the same wild type sequence. In striking contrast, the 69 EMB-resistant strains had eight distinct missense mutations in this 1,892-bp region. These mutations occurred in individual codons, including 285 (TTC→TTA, Phe→Leu), 330 (TTC→GTC, Phe→Val), and 630 (ACC→ATC, Thr→Ile). Five distinct missense mutations were identified in codon 306 (ATG): GTG (Val), CTG (Leu), ATA (Ile), ATC (Ile), and ATT (Ile) (see FIG. 7). When only epidemiologically unassociated organisms are considered, 69% of EMB-resistant bacteria had an amino acid change in the region of embB studied, and most replacements (89%) occurred at position 306.
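The codon changes listed above can be checked against the standard genetic code. The following Python sketch builds the standard codon table and classifies each reported change as synonymous, missense, or nonsense. The table construction and the function name are illustrative assumptions; only the codons themselves are taken from the text.

```python
from itertools import product

# Standard genetic code, built in TCAG order (one-letter amino acid codes,
# "*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Mutant codons reported at embB codon 306 (wild type ATG, Met).
CODON_306_VARIANTS = ["GTG", "CTG", "ATA", "ATC", "ATT"]


def classify_substitution(ref_codon: str, alt_codon: str):
    """Return (reference residue, variant residue, kind of change)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return ref_aa, alt_aa, kind


if __name__ == "__main__":
    for alt in CODON_306_VARIANTS:
        print("ATG ->", alt, classify_substitution("ATG", alt))
```

Run on the codon 306 variants, this gives five missense changes that collapse to three amino acid replacements (Val, Leu, and Ile).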

The analysis included two pairs of EMB-resistant and susceptible isolates with two different IS6110 fingerprints. The resistant isolates had either Met306Ile or Met306Val alterations, whereas the paired susceptible isolates had the identical wild type embB sequence.

C. Correlation of embB Mutation with EMB MICs

The relationship of embB mutations with the level of resistance to EMB was studied next. MICs were determined for resistant organisms by plating on 7H10 agar containing EMB in a range of concentrations between 0 and 40 μg/ml. Because of the relatively large number of strains characterized for the embB sequence, not all organisms were analyzed for EMB MIC. Rather, a representative subset of strains was studied that together contain a broad array of embB mutations and IS6110 subtypes. A summary of the MICs for the genetically characterized M. tuberculosis isolates is presented in Table 1. All isolates with the embB position 306 substitutions had EMB MICs of 20 or 40 μg/ml. Interestingly, regardless of genetic background as indexed by IS6110 typing and katG463/gyrA95 group, bacteria with a Met306Ile replacement had an MIC of 20 μg/ml, whereas organisms with a Met306Leu or Met306Val change had an MIC of 40 μg/ml. In addition, one organism with a Phe330Val change had an MIC of 40 μg/ml, and two epidemiologically related strains with a Thr630Ile change had an EMB MIC exceeding 40 μg/ml.
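The genotype and MIC pairs reported above can be collected into a simple lookup, which is roughly the shape a genotype-based report might take. This Python sketch is illustrative only: the names are assumptions, the values are restricted to those stated in the paragraph above, and substitutions not listed there are reported as unknown rather than predicted.

```python
from typing import Optional, Tuple

# (MIC in ug/ml, True if the text reports the MIC as exceeding this value)
REPORTED_MIC = {
    "Met306Ile": (20, False),
    "Met306Leu": (40, False),
    "Met306Val": (40, False),
    "Phe330Val": (40, False),
    "Thr630Ile": (40, True),   # reported as exceeding 40 ug/ml
}


def reported_mic(substitution: str) -> Optional[Tuple[int, bool]]:
    """Return the reported MIC for an embB substitution, or None if unlisted."""
    return REPORTED_MIC.get(substitution)


def format_mic(substitution: str) -> str:
    """Render the reported MIC as text, e.g. '>40 ug/ml'."""
    entry = reported_mic(substitution)
    if entry is None:
        return "not reported"
    value, is_lower_bound = entry
    return f"{'>' if is_lower_bound else ''}{value} ug/ml"


if __name__ == "__main__":
    for name in REPORTED_MIC:
        print(name, format_mic(name))
```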

III. Discussion

The recent demonstration that mutations in the M. tuberculosis embCAB locus are associated with EMB resistance prompted a more thorough examination of sequence variation at this locus in genetically distinct susceptible and resistant organisms. In particular, the aim of the study was to determine if polymorphisms were uniquely present in EMB-resistant versus EMB-susceptible M. tuberculosis isolates. The results documented two findings about sequence variation in this 7.5 kb region. First, there is restricted variation in embCAB in natural populations of M. tuberculosis recovered from diverse geographic sources. This result confirms data generated in the course of study of several antimicrobial agent target genes including katG, rpoB, strA, rrs, and others. The embCAB data are also consistent with the hypothesis that M. tuberculosis is evolutionarily new, perhaps having arisen as recently as 15,000-20,000 years ago. Second, the results document the unique and common association of embB amino acid residue 306 substitutions with EMB resistance. The only reasonable hypothesis to explain the occurrence of five distinct mutant codons resulting in three different amino acid replacements at embB position 306 of EMB-resistant organisms is that they have arisen by positive Darwinian selection in the course of drug therapy. Hence, it is likely that these amino acid substitutions mediate EMB resistance, rather than simply serving as surrogate markers for drug-resistant organisms.

The region of embB containing residue Met306 is highly conserved in M. tuberculosis, M. avium, M. leprae, and M. smegmatis. Moreover, based on amino acid sequence alignments, a Met residue would be present at position 306 in these four species. Biochemical studies and sequence alignments have suggested that embB is a glycosyltransferase. Telenti, et al. (Nature Med. (in press, 1997)) hypothesized that embB is an integral membrane protein with 12 transmembrane domains and a C-terminal globular region of about 400 amino acids with predicted location in the periplasm. The data herein are fully consistent with the idea that these mutations detrimentally affect a glycosyltransferase binding site to which EMB, proposed (Maddry, et al., Res. Microbiol. 147:106-112 (1996)) to be an arabinose analog, binds. The identification of a range of embB variants provides an important resource for biochemical and structural studies designed to more fully investigate the molecular mechanisms of EMB action.

The data generated from the EMB susceptibility testing suggest the existence of a non-random association between certain amino acid substitutions at embB position 306 and level of resistance to EMB. Analogous data have been presented for amino acid substitutions in the fluoroquinolone resistance-determining region of the A subunit of DNA gyrase (Xu, et al., J. Infect. Dis., 174:1127-1130 (1996)), changes in the beta subunit of RNA polymerase that confer resistance to rifampin and related drugs, and mutations conferring streptomycin resistance. The association of particular amino acid substitutions with EMB MIC level implies that position 306 contains important structure-function information.

All publications mentioned hereinabove are hereby incorporated by reference in their entirety.

While the foregoing invention has been described in detail for purpose of clarity and understanding, it will be appreciated by one skilled in the art from a reading of the disclosure that various changes in form and detail can be made without departing from the true scope of the invention in the appended claims.

__________________________________________________________________________

SEQUENCE LISTING

(1) GENERAL INFORMATION:
  (iii) NUMBER OF SEQUENCES: 57

(2) INFORMATION FOR SEQ ID NO:1:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
    Ile Ala Asp Gly Gly Val Leu Ala Thr Leu
                                        10

(2) INFORMATION FOR SEQ ID NO:2:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
    Leu Leu Trp His Ile Ile Gly Ala Thr Ser
                                        10

(2) INFORMATION FOR SEQ ID NO:3:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
    Ser Asp Asp Gly Tyr Asn Leu Thr Val Ala
                                        10

(2) INFORMATION FOR SEQ ID NO:4:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
    Arg Val Ser Ser Glu Ala Gly Tyr Leu Ala
                                        10

(2) INFORMATION FOR SEQ ID NO:5:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
    Ile Ala Asp Ala Ala Val Leu Ala Thr Leu
                                        10

(2) INFORMATION FOR SEQ ID NO:6:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
    Leu Leu Trp His Val Val Gly Ala Thr Ser
                                        10

(2) INFORMATION FOR SEQ ID NO:7:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
    Ser Asp Asp Gly Tyr Asn Leu Thr Ile Ala
                                        10

(2) INFORMATION FOR SEQ ID NO:8:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
    Arg Val Ala Pro Lys Ala Gly Tyr Leu Val
                                        10

(2) INFORMATION FOR SEQ ID NO:9:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
    Leu Ala Asp Ala Ala Val Ile Ala Thr Leu
                                        10

(2) INFORMATION FOR SEQ ID NO:10:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
    Leu Leu Trp His Val Ile Gly Ala Thr Ser
                                        10

(2) INFORMATION FOR SEQ ID NO:11:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
    Ser Asp Asp Gly Tyr Leu Leu Thr Val Ala
                                        10

(2) INFORMATION FOR SEQ ID NO:12:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
    Arg Val Ala Pro Lys Ala Gly Tyr Val Ala
                                        10

(2) INFORMATION FOR SEQ ID NO:13:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE:
    (A) DESCRIPTION: peptide
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE: internal fragment
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
    Ile Thr Asp Thr Gly Val Ile Gly Gly Leu
                                        10

(2) INFORMATION FOR SEQ ID NO:14:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
              (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 14:                           - Leu Ile Trp His Ile Val Gly Ala Pro Thr                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:15:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 15:                           - Ser Asp Asp Gly Tyr Asn Met Thr Ile Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:16:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  
amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 16:                           - Arg Val Ala Ser Glu Ala Gly Tyr Thr Thr                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:17:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 17:                           - Cys Leu Asp Gly Leu Val Ile Thr Ile Leu                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:18:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids     
                                           (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 18:                           - Ala Trp Trp His Phe Val Gly Ala Asn Thr                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:19:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 19:                           - Ser Asp Asp Gly Tyr Ile Leu Thr Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:20:                                            -      (i) SEQUENCE CHARACTERISTICS:                                   
                  (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 20:                           - Arg Val Ser Glu His Ala Gly Tyr Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:21:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 21:                           - Gly Leu Asp Thr Leu Val Ile Ala Val Leu                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:22:                                            -      (i) 
SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 22:                           - Val Trp Trp His Phe Val Gly Ala Asn Thr                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:23:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 23:                           - Ser Asp Asp Gly Tyr Ile Leu Thr Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:24: 
                                           -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 24:                           - Arg Val Ser Glu His Ala Gly Tyr Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:25:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 25:                           - Pro Leu Asp Gly Leu Val Ser Ala Met Leu                                      #                10                                          
                  - (2) INFORMATION FOR SEQ ID NO:26:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 26:                           - Val Trp Trp His Phe Val Gly Ala Asn Thr                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:27:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 27:                           - Ala Asp Asp Gly Tyr Ile Leu Thr Met Ala                                      #      
          10                                                            - (2) INFORMATION FOR SEQ ID NO:28:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 28:                           - Arg Val Ser Glu His Ala Gly Tyr Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:29:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 29:                           - Leu Val Asp Val Ala Val Ile 
Phe Gly Phe                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:30:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 30:                           - Leu Leu Trp His Val Ile Gly Ala Asn Ser                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:31:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 31:   
                        - Ser Asp Asp Gly Tyr Gln Met Gln Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:32:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 32:                           - Arg Thr Ala Asp His Ser Gly Tyr Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:33:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    
-     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 33:                           - Leu Ala Asp Val Ala Val Ile Phe Gly Phe                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:34:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 34:                           - Val Leu Trp His Val Ile Gly Ala Asn Ser                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:35:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT 
TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 35:                           - Ser Asp Asp Gly Tyr Ile Leu Gly Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:36:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 36:                           - Arg Val Ala Asp Arg Ala Gly Tyr Met Ser                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:37:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                 
                                  -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 37:                           - Leu Thr Asp Ala Val Val Ile Phe Gly Phe                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:38:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 38:                           - Leu Leu Trp His Val Ile Gly Ala Asn Ser                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:39:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                               
         -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 39:                           - Ser Asp Asp Gly Tyr Ile Leu Gly Met Ala                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:40:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 40:                           - Arg Val Ala Asp His Ala Gly Tyr Met Ser                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:41:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) 
DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 41:                           - Ala Val Asp Gly Val Val Val Gly Gly Met                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:42:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:                                                                (A) DESCRIPTION:  pepti - #de                                        -    (iii) HYPOTHETICAL:  NO                                                   -      (v) FRAGMENT TYPE: internal fragment                                    -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 42:                           - Ala Ile Trp Tyr Val Ile Gly Ala Asn Ser                                      #                10                                                            - (2) INFORMATION FOR SEQ ID NO:43:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  10 amin - #o acids                                                (B) TYPE:  amino aci - #d                                                      (C) STRANDEDNESS:  sing - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE:              
          (A) DESCRIPTION: peptide
   (iii) HYPOTHETICAL: NO
     (v) FRAGMENT TYPE: internal fragment
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
Ser Asp Asp Gly Tyr Ile Leu Gln Met Ala
                                    10

(2) INFORMATION FOR SEQ ID NO:44:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 10 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE:
          (A) DESCRIPTION: peptide
   (iii) HYPOTHETICAL: NO
     (v) FRAGMENT TYPE: internal fragment
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
Arg Thr Ala Glu His Ala Gly Tyr Met Ala
                                    10

(2) INFORMATION FOR SEQ ID NO:45:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 10095
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
               -     (ii) MOLECULE TYPE: Genomic DNA                                          -    (iii) HYPOTHETICAL:  NO                                                   -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 45:                           #              50ACGTCT ACCCCAACCA GCCCAATGTT CGCCGCTACA                       #             100ACCGCC CTCTTCGCCG ACCCGCGTTT CGTCGTCGAG                       #             150CGTGCT GGCCATCCGC AAGCCGCAGG AGAGCGCGTG                       #             200CGCCCC ACCCCGTATC GCCGTCCGGC TACCATCTAC                       #             250CGGGAG CAAACTACCG GATCGCCCGG TACGTCGCTG                       #             300CTAGGC GCTGTGCTGG CCATCGCCAC CCCACTGCTG                       #             350CACCGC GCAATTGAAC TGGCCCCAAA ACGGCACGTT                       #             400CACCGC TGATTGGCTA CGTGGCCACC GACTTGAACA                       #             450CAGGCC GCCGCCGGAC TGGCCGGATC GCAGAACACC                       #             500GTTGTC AACGGTGCCC AAGCAGGCGC CTAAGGCCGT                       #             550TGCTGC AACGGGCCAA CGACGACCTG GTGCTTGTGG                       #             600TTGGTC ACCGCCCCGC TGAGTCAGGT GCTCGGCCCG                       #             650GACATT CACCGCGCAC GCCGATCGGG TCGCCGCCGA                       #             700TGCAGG GACCCAATGC TGAGCACCCC GGTGCACCGC                       #             750AGCGGC TACGACTTCC GCCCGCAGAT CGTCGGGGTG                       #             800CGGGCC GGCGCCACCG GGTCTGAGCT TCTCGGCGAG                       #             850ACAGCA GCAGCCCCAC GCCGCTGAAG ATGGCCGCCA                       #             900GCGCTC ACCGGCGCCG CCCTGGTGGC GCTGCACATC                       #             950CGGCAT GCGGCACCGG CGGTTCCTGC CCGCGCGCTG                       #            1000GTCTGG ACACCCTGGT TATCGCCGTG CTGGTGTGGT                       #            1050GCCAAC ACCTCCGACG ACGGCTACAT CCTGACCATG                       #            1100GCATGC GGGCTATATG GCCAACTACT ACCGCTGGTT                       #         
   1150CGCCTT TCGGCTGGTA CTACGACCTG CTGGCGCTGT                       #            1200ACGGCC AGTATCTGGA TGCGCCTACC CACCCTGGCG                       #            1250CTGGTG GGTAATCAGC CGTGAGGTCA TTCCCCGGCT                       #            1300AGACGA GCCGGGCAGC GGCGTGGACG GCGGCGGGCA                       #            1350TGGCTG CCGCTGGACA ACGGCCTTCG GCCCGAGCCG                       #            1400CATCCT GCTGACCTGG TGCTCGGTGG AGCGGGCGGT                       #            1450TGCTGC CGGTGGCAAT CGCCTGCATC ATCGGTGCCT                       #            1500GGGCCG ACGGGCATCG CCTCGATCGG TGCGCTGCTG                       #            1550GCTACG GACCATCCTG CACCGGCGTT CCAGGCGGTT                       #            1600TGGTGG CGCCGATCCT GGCCGCGGCC ACCGTCACCG                       #            1650CGTGAT CAGACCTTCG CGGGCGAGAT CCAGGCCAAC                       #            1700CGTAGG GCCCAGCCTG AAGTGGTTCG ACGAACACAT                       #            1750TGTTCA TGGCCAGCCC CGACGGCTCG ATCGCCCGCC                       #            1800GCCTTG GTGCTGGCGC TCGCGGTATC GGTGGCAATG                       #            1850CCGCAT TCCAGGTACC GCTGCTGGAC CGAGCCGCCG                       #            1900CGATCA TTTCCTTCCT CGCGATGATG TTCACCCCGA                       #            1950CACTTC GGGGTGTTCG CGGGGTTGGC CGGGTCGCTG                       #            2000GGTCGC GGTGACGGGC GCTGCGATGC GCTCGCGGCG                       #            2050TCGCCG CCGTGGTGGT CTTCGTGTTG GCCCTGTCGT                       #            2100GGCTGG TGGTACGTGT CCAACTTCGG TGTGCCATGG                       #            2150GAAGTG GCGATGGTCG CTTACCACCG CACTCCTCGA                       #            2200TGCTGC TGCTAGCGGC ATGGTTCCAC TTCGTCGCCA                       #            2250CGAACA GCCAGGCCAA CCCGGTTTAG GGCACGACTA                       #            2300GTCCCC GTTGGCAATT GCCACGTGGT TGCTGGTGCT                       #            2350CGCTGA CCCAGGCGAT GATTTCCCAG TACCCGGCGT                       #            2400TCTAAC CTACAGGCTT 
TGGCCGGCAA GACCTGCGGG                       #            2450GCTGGT GGAGCTGGAT CCCAACGCAG GCATGCTGGC                       #            2500CGTTGG CCGACGCCCT GGGAGCCGGC CTGTCTGAAG                       #            2550GGCATT CCCGCCGACG TCACCGCCGA CCCGGTGATG                       #            2600TCGCAG TTTCCTCAAC GACGACGGGC TGATCACCGG                       #            2650CCGAAG GGGGCACCAC GGCCGCACCG GGAATCAACG                       #            2700CTGCCC TACAACCTGG ACCCGGCCCG TACACCGGTG                       #            2750AGCCGG CGTGCAGGTG CCCGCCATGC TGCGGTCGGG                       #            2800CCACCA ACGAGCAGCG GGACAGGGCG CCGCTGCTGG                       #            2850GGGCGA TTCGACTCCC GCGAGGTCCG GTTGCAGTGG                       #            2900AGCGGC CGCCGGACAC CACGGTGGGT CGATGGAATT                       #            2950CCGCGC CGGCCTGGCG CAACCTGCGC GCACCACTGT                       #            3000ACCGCC ACCCAGGTCC GGTTGGTCGC CGACGACCAG                       #            3050GCACTG GATCGCCCTC ACACCACCGC GGATTCCGCG                       #            3100AGAACG TGGTGGGCGC AGCGGATCCG GTGTTCCTGG                       #            3150CTGGCA TTCCCCTGCC AACGCCCGTT CGGCCACCAA                       #            3200GACACC CAAGTGGCGG ATCCTGCCGG ACCGGTTCGG                       #            3250CACCGG TGATGGATCA CAATGGCGGT GGCCCGCTGG                       #            3300CTGATG CGCGCAACCA CGGTGGCCAG CTACCTCAAA                       #            3350GGACTG GGGCGCGTTA CAGCGGTTGA CGCCTTACTA                       #            3400CCGCTG ATCTGAACCT AGGAACGGTG ACTCGCAGCG                       #            3450GCGCCG TTGCGCCGCG GCTAGAAGTG CCGTGGCCAC                       #            3500CCTCCG CGGCCCCGCA TCCTCACCGC CCTTAACCGC                       #            3550AGCCTC GTGCCCCACG ACGGTAATGA GCGATCTCAC                       #            3600AGCAGC CGTCGTCTCG GGAATCGCGG GTCTGCTGCT                       #            3650CGCTGC TTCCGGTGAA CCAAACCACC GCGACCATCT    
                   #            3700AGCACC GCCGACGGCA ACATCACCCA GATCACCGCC                       #            3750GGCGCC ACGCGCGCTG GACATCTCGA TCCCCTGCTC                       #            3800TGCCCG CCAACGGCGG CCTGGTGCTG TCCACACTGC                       #            3850GATACC GGTAAGGCCG GGCTGTTCGT CCGCGCCAAC                       #            3900GGCGTT CCGCGACTCG GTGGCCGCGG TGGCGGCCCG                       #            3950CGGGAG GCTGTAGCGC GCTGCATATC TGGGCCGATA                       #            4000GCTGAT TTTATGGGTA TACCCGGCGG CGCCGGGACC                       #            4050GAAGCC ACAGGTTGGC GGCATCTTCA CCGACCTGAA                       #            4100CCGGGC TGTCGGCCCG CGTCGACATC GACACTCGGT                       #            4150GGCGCG CTCAAGAAGG CCGTGATGCT CCTCGGCGTG                       #            4200AGCCAT GGTGGGGCTG GCCGCGCTGG ACCGGCTCAG                       #            4250TGCGCG ACTGGCTGAC CCGATATCGC CCGCGGGTGC                       #            4300AGCCGG CTCGCTGACG CAGCGGTGAT CGCGACCTTG                       #            4350CATCGG CGCCACCTCG TCCGATGACG GCTACCTTCT                       #            4400TCGCCC CGAAGGCCGG CTATGTAGCC AACTACTACC                       #            4450ACGGAG GCGCCGTTCG ACTGGTATAC ATCGGTGCTT                       #            4500GGTGAG CACCGCCGGC GTCTGGATGC GCCTGCCCGC                       #            4550TCGCCT GCTGGCTGAT CGTCAGCCGT TTCGTGCTGC                       #            4600GGCCCG GGCGGGCTGG CGTCCAACCG GGTCGCTGTG                       #            4650GGTGTT CCTGTCCGCC TGGCTGCCGT TCAACAACGG                       #            4700CGCTGA TCGCGCTGGG TGTGCTGGTC ACGTGGGTGT                       #            4750ATCGCG CTCGGACGGC TGGCCCCGGC CGCGGTAGCC                       #            4800GCTTAC CGCGACGCTG GCACCGCAGG GGTTGATCGC                       #            4850TGACTG GTGCGCGCGC CATCGCCCAG AGGATCCGGC                       #            4900GATGGA CTGCTGGCGC CGCTGGCGGT GCTGGCCGCG                       #     
       4950CACCGT GGTGGTGTTT CGGGACCAGA CGCTGGCCAC                       #            5000CACGCA TCAAGTACAA GGTCGGCCCG ACCATCGCCT                       #            5050CTGCGC TACTACTTCC TTACCGTGGA GAGCAACGTT                       #            5100CCGCCG GTTCGCGGTG CTGGTGTTGC TGTTCTGCCT                       #            5150TCGTGC TGCTGCGGCG CGGCCGGGTG GCGGGGCTGG                       #            5200TGGCGA CTGATCGGCA CTACGGCGGT CGGCCTGCTG                       #            5250GCCAAC CAAGTGGGCC GTGCAGTTCG GCGCATTCGC                       #            5300TGTTGG GTGCGGTCAC CGCGTTCACC TTTGCCCGCA                       #            5350CGACGC AACCTCACGC TGTACGTGAC CGCGTTGCTG                       #            5400GGCAAC CTCGGGCATC AACGGGTGGT TCTACGTCGG                       #            5450CGTGGT ATGACATCCA GCCCGTCATC GCCAGCCACC                       #            5500TTTCTG ACGCTGTCGA TCCTCACCGG ATTGCTGGCA                       #            5550CCGGAT GGACTACGCC GGGCACACCG AAGTCAAAGA                       #            5600GCATCT TGGCCTCTAC GCCACTGCTG GTGGTCGCGG                       #            5650GGCGAA GTCGGCTCGA TGGCCAAGGC CGCGGTGTTC                       #            5700CACCAC CGCCAAGGCC AACCTGACCG CGCTCAGCAC                       #            5750GTGCGA TGGCCGACGA CGTGCTGGCC GAGCCCGACC                       #            5800CTGCAA CCGGTTCCGG GCCAGGCGTT CGGACCGGAC                       #            5850TATCAG TCCCGTCGGC TTCAAACCCG AGGGCGTGGG                       #            5900CCGACC CGGTGGTCTC CAAACCCGGG CTGGTCAACT                       #            5950AACAAA CCCAACGCCG CCATCACCGA CTCCGCGGGC                       #            6000GGGCCC GGTCGGGATC AACGGGTCGC ACGCGGCGCT                       #            6050ACCCGG CACGTACCCC GGTGATGGGC AGCTACGGGG                       #            6100GCCACG GCCACCTCGG CCTGGTACCA GTTACCGCCC                       #            6150GCCGCT GGTGGTGGTT TCCGCGGCCG GCGCCATCTG                       #            6200ACGGCG 
ATTTCATCTA CGGCCAGTCC CTGAAACTGC                       #            6250CGGCCG GACGGCCGCA TCCAGCCACT GGGGCAGGTA                       #            6300CGGACC GCAACCCGCG TGGCGCAATC TGCGGTTTCC                       #            6350CGCCGG AGGCCGACGT GGCGCGCATT GTCGCCTATG                       #            6400CCTGAG CAATGGTTCG CCTTCACCCC GCCCCGGGTT                       #            6450TCTGCA GCGGTTGATC GGGTCAGCGA CACCGGTGTT                       #            6500CCGCAG CCAACTTCCC CTGCCAGCGA CCGTTTTCCG                       #            6550GCCGAG CTTCCGCAGT ACCGGATCCT GCCGGACCAC                       #            6600GTCGTC GAACCTATGG CAGTCCAGCT CGACCGGCGG                       #            6650CCCAGG CGCTGCTGCG CACCTCGACG ATCGCCACGT                       #            6700TGGTAT CGCGACTGGG GATCGGTGGA GCAGTACCAC                       #            6750CGATCA GGCTCCAGAC GCCGTTGTCG AGGAGGGCGT                       #            6800GCTGGG GTCGGCCAGG ACCGATCAGG GCGCTGCCAT                       #            6850GCAGAC GCAAAAGCAC CCCAAATCGG GCGATTTTGG                       #            6900GCTCGC GGGACGCGCT GGGTGGCCAC CATCGCCGGG                       #            6950GTTGTC GGTGGCGACG CCGCTGCTGC CCGTCGTGCA                       #            7000TCGACT GGCCACAGCG GGGGCAACTG GGCAGCGTGA                       #            7050TCGCTG ACGCCGGTCG ACTTTACCGC CACCGTGCCG                       #            7100CGCCAT GCCACCCGCG GGCGGGGTGG TGCTGGGCAC                       #            7150GCAAGG ACGCCAATTT GCAGGCGTTG TTCGTCGTCG                       #            7200GTGGAC GTCACCGACC GCAACGTGGT GATCTTGTCC                       #            7250GGTGAC GTCCCCGCAG TGTCAACGCA TCGAGGTCAC                       #            7300GCACCT TCGCCAACTT CGTCGGGCTC AAGGACCCGT                       #            7350CGCAGC GGCTTCCCCG ACCCCAACCT GCGCCCGCAG                       #            7400CACCGA CCTGACCGGG CCCGCGCCGC CCGGGCTGGC                       #            7450TCGACA CCCGGTTCTC CACCCGGCCG 
ACCACGCTGA                       #            7500ATCGGG GCGATCGTGG CCACCGTCGT CGCACTGATC                       #            7550GGACCA GTTGGACGGG CGGGGCTCAA TTGCCCAGCT                       #            7600TCCGGC CTGCATCGTC GCCGGGCGGC ATGCGCCGGC                       #            7650TGGCGC ACCTTCACCC TGACCGACGC CGTGGTGATA                       #            7700CTGGCA TGTCATCGGC GCGAATTCGT CGGACGACGG                       #            7750TGGCCC GAGTCGCCGA CCACGCCGGC TACATGTCCA                       #            7800TTCGGC AGCCCGGAGG ATCCCTTCGG CTGGTATTAC                       #            7850GATGAC CCATGTCAGC GACGCCAGTC TGTGGATGCG                       #            7900CCGCCG GGCTAGTGTG CTGGCTGCTG CTGTCGCGTG                       #            7950CTCGGG CCGGCGGTGG AGGCCAGCAA ACCCGCCTAC                       #            8000GGTCTT GCTGACCGCG TGGATGCCGT TCAACAACGG                       #            8050GCATCA TCGCGCTCGG CTCGCTGGTC ACCTATGTGC                       #            8100ATGCGG TACAGCCGGC TCACACCGGC GGCGCTGGCC                       #            8150ATTCAC ACTGGGTGTG CAGCCCACCG GCCTGATCGC                       #            8200TGGCCG GCGGCCGCCC GATGCTGCGG ATCTTGGTGC                       #            8250GTCGGC ACGTTGCCGT TGGTGTCGCC GATGCTGGCC                       #            8300CCTGAC CGTGGTGTTC GCCGACCAGA CCCTGTCAAC                       #            8350CCAGGG TTCGCGCCAA AATCGGGCCG AGCCAGGCGT                       #            8400CTGCGT TACTACTACC TCATCCTGCC CACCGTCGAC                       #            8450GCGCTT CGGCTTTTTG ATCACCGCGC TATGCCTGTT                       #            8500TCATGT TGCGGCGCAA GCGAATTCCC AGCGTGGCCC                       #            8550CGGCTG ATGGGCGTCA TCTTCGGCAC CATGTTCTTC                       #            8600CACCAA GTGGGTGCAC CACTTCGGGC TGTTCGCCGC                       #            8650TGGCCG CGCTGACGAC GGTGTTGGTA TCCCCATCGG                       #            8700CGCAAC CGGATGGCGT TCCTGGCGGC GTTATTCTTC               
        #            8750TTGGGC CACCACCAAC GGCTGGTGGT ATGTCTCCAG                       #            8800TCAACA GCGCGATGCC GAAGATCGAC GGGATCACAG                       #            8850TTCGCC CTGTTTGCGA TCGCCGCCGG CTATGCGGCC                       #            8900GCCCCG CGGCGCCGGC GAAGGGCGGC TGATCCGCGC                       #            8950CGGTAC CGATCGTGGC CGGTTTCATG GCGGCGGTGT                       #            9000GTGGCC GGGATCGTGC GACAGTACCC GACCTACTCC                       #            9050CGTGCG GGCGTTTGTC GGCGGCTGCG GACTGGCCGA                       #            9100AGCCTG ATACCAATGC GGGTTTCATG AAGCCGCTGG                       #            9150TCTTGG GGCCCCTTGG GCCCGCTGGG TGGAGTCAAC                       #            9200GCCCAA CGGCGTACCG GAACACACGG TGGCCGAGGC                       #            9250CCAACC AGCCCGGCAC CGACTACGAC TGGGATGCGC                       #            9300AGTCCT GGCATCAATG GTTCTACGGT GCCGCTGCCC                       #            9350CGCCCG GGTACCGTTG GCAGGCACCT ACACCACCGG                       #            9400GCACAC TCGTCTCGGC GTGGTATCTC CTGCCTAAGC                       #            9450CCGCTG GTCGTGGTGA CCGCCGCGGG CAAGATCGCC                       #            9500GCACGG GTACACCCCC GGGCAGACTG TGGTGCTCGA                       #            9550GACCCG GAGCGCTGGT ACCCGCCGGG CGGATGGTGC                       #            9600GGAGAG CAGCCCAAGG CGTGGCGCAA CCTGCGCTTC                       #            9650GCCCGC CGATGCCGTC GCGGTCCGGG TGGTGGCCGA                       #            9700CACCGG AGGACTGGAT CGCGGTGACC CCGCCGCGGG                       #            9750TCACTG CAGGAATATG TGGGCTCGAC GCAGCCGGTG                       #            9800GGTCGG TTTGGCCTTC CCGTGCCAGC AGCCGATGCT                       #            9850TCGCCG AAATCCCGAA GTTCCGCATC ACACCGGACT                       #            9900CTGGAC ACCGACACGT GGGAAGACGG CACTAACGGC                       #            9950CACCGA CCTGTTGCTG CGGGCCCACG TCATGGCCAC                       #           
10000ACTGGG CCCGCGATTG GGGTTCCCTG CGCAAGTTCG                       #           10050GCCCCT CCCGCCCAGC TCGAGTTGGG CACCGCGACC                       #               10095CC GGGCAAGATC CGAATTGGTC CATAG                            - (2) INFORMATION FOR SEQ ID NO:46:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                     (A) LENGTH:  9960                                                              (B) TYPE:  nucleic a - #cid                                                    (C) STRANDEDNESS:  doub - #le                                                  (D) TOPOLOGY:  linear                                                -     (ii) MOLECULE TYPE: Genomic DNA                                          -    (iii) HYPOTHETICAL:  NO                                                   -     (xi) SEQUENCE DESCRIPTION:  SEQ ID NO: - # 46:                           #              50CGGTTC GAGGTGTCCG ACCACGGTCC GTTCGTGCTC                       #             100CGGGGG AAAGCCGGAG ACCGATGGCC ACTGATATCC                       #             150CCCGTG ACCGGTCCGC ATGCAGCGGG TGGCAGCAAC                       #             200GCTCGT CGCGATCATC GCCGGACTTC TCGGCACGCT                       #             250CGCCGC TGCTGCCGGT CGAGCAGACC ACCGCCGAGC                       #             300AACGGC GTCTGGCAGA GCGTCGACGC GCCGCTGATC                       #             350CGACCT GACCGTCACC GTGCCGTGCC AGGCCGCCGC                       #             400CGGAGA ACCGCAACCG CAGCGTGCTG TTGTCGACGG                       #             450CCCAAG GCCATCGACC GCGGCCTGCT GATCGAACGC                       #             500CACGGT CATCGTGCGC AACACCCCGG TCGTCAGCGC                       #             550TGCTCA GCCCCGACTG CCGGTACCTG ACGTTCACCG                       #             600GTGACC GGTGAGTTCG TCGGCCTCAC GCAGGGTCCC                       #             650GGGCGA GGCGGTGCGC GGCGAGCGCA GCGGCTACGA                       #             700TCGTCG GTGTGTTCAC 
CGACCTGTCC GGCCCGGCGC                       #             750TTGTCG GCGACCATCG ACACGCGCTA CAGCACGTCG                       #             800ACTGCT CGCGATGATC GTGGGCGTCG CGATGACCGT                       #             850CGCTGC ACGTGCTGGA CTGCGCCGAC GGCCGGCGCC                       #             900CCGTCG CGCTGGTGGT CGATGACGCC GCTGGACGGG                       #             950GCTGGT GTGGTGGCAC TTCGTCGGCG CCAACACGGC                       #            1000TCCTGA CCATGGCCCG TGTGTCCGAG CACGCCGGCT                       #            1050TACCGC TGGTTCGGTA CGCCTGAGTC GCCGTTCGGC                       #            1100GCTGGC GTTGTGGGCG CACGTGTCGA CGGCCAGCGT                       #            1150CCACGC TGCTCATGGG TCTGGCCTGC TGGTGGGTGA                       #            1200ATCCCG CGCCTGGGCG CCGCCGCCAA GCACAGCCGC                       #            1250CGCCGC GGGCCTGTTC CTGGCGTTCT GGCTGCCGCT                       #            1300GCCCCG AGCCCATCAT CGCGCTGGGC ATCCTGCTGA                       #            1350GAGCGC GGCGTCGCGA CCAGCAGGCT GCTGCCGGTG                       #            1400CATCGG TGCACTCACG CTGTTCTCCG GCCCCACCGG                       #            1450GCGCCC TGCTGGTCGC CATCGGACCG CTGAAAACCA                       #            1500GTTTCA CGGTTCGGCT ATTGGGCACT GCTGGCGCCG                       #            1550CACCGT CACGATCTTC CTGATCTTCC GCGACCAGAC                       #            1600TGCAGG CCAGCAGCTT CAAGTCGGCC GTCGGCCCCA                       #            1650GACGAG CACATCCGCT ACTCACGCCT GTTCACCACA                       #            1700GGTGGC GCGGCGCTTC GCGGTGCTCA CGCTGCTGCT                       #            1750CGATCG CGATGACGCT GCGCAAGGGC CGCATCCCCG                       #            1800CCGAGC AGACGCATCA TCGGCATCAC GATCATCTCG                       #            1850GTTCAC CCCGACCAAG TGGACCCACC AATTCGGTGT                       #            1900CGGGGT GCCTCGGCGC CCTGGCCGCC GTCGCGGTCA                       #            1950AAGTCG CGGCGTAACC GCACGGTGTT CGGCGCGGCA    
                   #            2000GGCGCT GTCGTTCGCG ACGGTCAACG GCTGGTGGTA                       #            2050GTGTGC CCTGGTCGAA CTCGTTCCCC GAGTTCAAGT                       #            2100ATGCTG CTGGGCCTGT CGGTGCTCGC GCTGCTGGTC                       #            2150CTTCAG CGGGCGCGAC GTCTCGCCCG ACCGGCCGCA                       #            2200GCCTTC TGGTCGCCCC GCTCGCGGTC GCCACGTGGG                       #            2250GAGGTG GTCTCGCTGA CGCTGGGGAT GATCAACCAG                       #            2300GGTGGG CCGCTCCAAC CTCAACGCCC TGACCGGCAA                       #            2350CCAACG ACGTGCTGGT CGAGCAGAAC GCCAACGCGG                       #            2400ATCGGT GAGCCGGCCG GTCAGGCGCT CGGCGCCGTG                       #            2450CGGGCC GAACGGCATC CCCTCGGATG TCTCCGCGGA                       #            2500AGCCCG GCACGGACAA CTTCGCCGAC AGCGACTCCG                       #            2550ACCGAG GTCGGCACGG AAGGCGGCAC CACAGCTGCC                       #            2600ATCCCG CGCGCGCCTG CCGTACGGCC TGAACCCGGC                       #            2650TCGGTT CGTGGCGTTC GGGCACACAG CAGCCCGCGG                       #            2700TGGTAC CGGCTGCCCG ACCGCGACCA GGCGGGCCCG                       #            2750GGCCGC CGGTCGGTTC GACCAGGGCG AGGTCGAGGT                       #            2800ACGAGC AGGCCGCGGC CAACGAGCCG GGCGGCAGCA                       #            2850GTCGGC GCGGCCCCGG CCTGGCGCAA CCTGCGCGCC                       #            2900CCCGCC CGAGGCCACC CAGATCCGGC TGGTCGCCAG                       #            2950CACCCC AGCACTGGAT CGCCCTGACC CCGCCGCGCA                       #            3000ACGCTG CAGGAGGTCG TCGGATCGTC CGACCCGGTG                       #            3050CGTAGG CCTGGCGTTC CCGTGCCAGC GGCCGTTCGA                       #            3100TCGTCG AGGTGCCCAA GTGGCGCATC CTGCCGGACC                       #            3150GCCAAT TCGCCGGTCA TGGACTACCT GGGCGGCGGC                       #            3200CGAGCT GCTGCTGCGC CCGTCGTCGG TGCCGACCTA                       #     
       3250GGTACC GCGACTGGGG CTCGTTGCAG CGGCTGACGC                       #            3300GCCCAG CCGGCGCGCC TGGACCTCGG CACGGCCACG                       #            3350GAGCCC GGCGCCCCTG CGGCTGAGTT GAGCGGCTGA                       #            3400GATCAC GGTAGGGCCG ACGCGCGCCC GCATGGCCGA                       #            3450TCGCAT ACCATCGAGC CTCGTGCCGG GCGATGAACA                       #            3500CAGATG ACGCAGTGAC CGAACCGTCC CGCATCGCAC                       #            3550GTCGCC GGCATCGCGG GCGTGTTGTT GTGCGGTCTG                       #            3600GGTGGA GGAGACCACC GCGACCGTCC TGTGGCCGCA                       #            3650ACGGCA ACGTCACCGA ACTGACGGCG CCGCTGGTGG                       #            3700GCACTC GACGTCACGA TCCCGTGCCG CGCCGTGGCC                       #            3750CGGCGG CGTGGTGTTC TCGACGAACC CGGCAGGCGG                       #            3800GCAACG GCATGTTCAT CCGCGCCAAC GCCGACGTGG                       #            3850CGCGAC ACGGTCGCCG CGGTCGCACC GCGTGAGGCC                       #            3900GTGCAG TGAGATCCAC GTCTGGGCCG ACGTCAGCGC                       #            3950TCGCCG GTATCCCCGA CGCCAGCGGA ACCCTGCCCG                       #            4000CAGGTC TCGGGTGTCT TCACCGACCT CAAGGTGCCC                       #            4050GGCCGC GCGCATCGAC ATCGACACCC GCTTCATCAC                       #            4100TGAAGA CCGCCGTGAT GGTGCTCGGC CTCGCGTGCG                       #            4150GTCGCG CTGGCCCTGT TGGACCGCGG ATGGCGCAGG                       #            4200GCGCGG ACGCGCCGGG CTGTGGACGT GGATCACCGA                       #            4250GCGGCC TGCTCATCTG GCACATCGTC GGCGCGCCCA                       #            4300TACAAC ATGACCATCG CCCGGGTGGC GTCCGAGGCG                       #            4350CTACTA CCGCTACTTC GGCGCGTCCG AGGCCCCGTT                       #            4400GCGTGC TGTCGCACCT GGCCTCGATC AGCACCGCGG                       #            4450CTGCCC GCCACGGCGG CCGCTATCGC GACGTGGCTG                       #            4500CGTGCT 
GCCCCGCATC GGCAGGCGCG TCGCGGCCAA                       #            4550TCACCG CGGGTGCGAC GTTCCTGGCC GCGTGGCTGC                       #            4600TTGCGT CCCGAACCGC TGATCGCGTT CGCGGTGATC                       #            4650GGTGGA GAACTCCATC GGCACGCGGC GCCTGTGGCC                       #            4700TCGTCA TCGCGATGTT CTCCGTCACA CTCGCCCCGC                       #            4750CTGGCG CCGCTGCTGG TCGGCGCGCG CGCCATCGGC                       #            4800CCGCCG TGCGGCACCG GGATCCTGGC GTCCCTGCCC                       #            4850TCGCCG TGGTCTTCGT GATCATCTTC CGCGATCAGA                       #            4900GCCGAG TCGGTGCGCA TCAAGTACGT CGTGGGACCG                       #            4950CCAGGA ATTCCTGCGG TACTACTTCC TCACGGTCGA                       #            5000GATCCC TGACCCGCCG ATTCGCGGTG CTGGTGCTGC                       #            5050GGCCTC ATCATGGTGC TGCTGCGCCG CGGCCGGGTG                       #            5100CGGGCC GCTGTGGCGG CTGTGCGGAT CGACCGCGAT                       #            5150TGATCC TCACCCCCAC CAAGTGGGCG ATCCAGTTCG                       #            5200CTGGCC GGCGCCCTCG GTGGTGTGAC GGCATTCGCG                       #            5250CCTGCA CAGCCGACGC AACCTCGCGC TGTACGTCAC                       #            5300TCCTGG CGTGGGCCAC CTCGGGCCTC AACGGCTGGT                       #            5350TACGGC GTGCCGTGGT TCGACAAGCA GCCTGTGATC                       #            5400CACCAC GATCTTCCTG GTGCTCGCGA TCGTCGGCGG                       #            5450GGCTGC ACTTCCGCAT GGACTACGCG GGGCACACCG                       #            5500GGCAGA AACCGCGCGC TCGCCTCGAC GCCGCTGTTG                       #            5550CATGGT GGTGCTCGAA CTCGGCTCGA TGGTCAAGGC                       #            5600ACCCCG TCTACACCGT GGGCTCGGCC AACATCGCCG                       #            5650GGCGAC AGCTGTGCGA TGGCCGACGC CGTGCTGGTC                       #            5700CGAGGG CATGCTGCAA CCGGTTCCGG GCCAGCGGTT                       #            5750CGCTGG GCGGCGAGGA CCCCGTCGGC 
TTCACCCCCA                       #            5800CACCTC GAACCCGAGC CCGTCGGGAC CAACCCGGGC                       #            5850GGGGCC GGTCGACAAG CCCAACATCG GTATCGCCTA                       #            5900GCGGCG GCTACGCCCC CGAGGGCGTC AACGGGTCGC                       #            5950TTCGGC CTGGACCCGT CCCGCACCCC GGTGATGGGC                       #            6000CAAGCT GGCCGCCAAG GCCACGTCGG CCTGGTACCA                       #            6050CGCCGG ACCGCCCGCT GGTGACCGTC GCCGCGGCAG                       #            6100TACGAG GAAGACGGCT CGTTCAACTA CGGCCAGTCG                       #            6150GGGTGT GCACCGGCCC GACGGCACCT ACCAGGCGCT                       #            6200CCATCG ACATCTTCCA GCAGAAGGCG TGGCGCAACC                       #            6250GCGTGG GCGCCGCCGG AGGCCAACGT CGCGCGCATC                       #            6300CAACCT GTCCGAGGAC CAGTGGTGCG CGTTCACGCC                       #            6350TGCTGC AGACCGCGCA GCAGTTCCTC GGATCGCAGA                       #            6400GACATC GCCACGGCCG CGAACTTCCC GTGCCAGCGG                       #            6450GCTCGG TGTTGCCGAG TTGCCCGAGT ACCGCATCAT                       #            6500AGATGG TGGTGTCGTC CAACCAGTGG CAGTCCGCCG                       #            6550TTCCTG TTCATCCAGG CGCTGCTGAG GACCGAGGCG                       #            6600GCGTGA CGACTGGTAC CGCGACTGGG GCTCGATCGA                       #            6650TGGTAC CGCAGGAGCA GGCGCCCACA GCCGCCATCG                       #            6700CGAGTG TTCGGATGGA GTCGCGGCGG ACCGATCAGG                       #            6750GGCAAC ATGGATGAAG CCGTGAGCGG CAACATGGAT                       #            6800CGGCAA GGACGTGCGG ATCGCACGCT GGGTCGCCAC                       #            6850TCGGAT TCGTGCTCTC CGTGTCCATC CCGCTGCTGC                       #            6900ACGGCC ACGCTGAACT GGCCGCAGCA GGGCAGGCTC                       #            6950TCCGCT GATCTCGCAG GCCCCGTTGG AGCTGACCGC                       #            7000CGGTGG TGCGCGACCT GCCCCCCGAG GGCGGCCTGG               
7050  CCCGCC GAGGGCCGCG ACGCCGCACT CAACGCGATG
7100  CGAGAC CCGCGTCGAC GTGATCGTGC GCAACGTCGT
7150  ACCGCG ACCGCGTCGC GGGACCTGAC TGCCAACGCA
7200  AACCTG GATGGCACCT ACGCCGATTT CGTCGGTCTC
7250  TGAGGA CGCGGGCAAG CTGCAGCGCA CCGGCTACCC
7300  GGCCCG CGATCGTCGG TGTGTTCACC GACCTCACCG
7350  GGACTG TCGGTGTCGG CGGAGATCGA CACGCGCTTC
7400  GGCGCT CAAGCTCGCG GCCATGCTGC TGGCGATCGT
7450  CGCTGC TCGCGCTGTG GCGCCTCGAC CGGCTCGACG
7500  CGCCTG ATCCCGACGC GCTGGCGCAC GGTCACCGCG
7550  GGTCGG CGGCATGGCG ATCTGGTACG TGATCGGCGC
7600  ACGGCT ACATCCTGCA GATGGCGCGC ACGGCCGAGC
7650  GCGAAC TACTTCCGCT GGTTCGGCAG CCCCGAGGAC
7700  CTACAA CGTGCTGGCG CTCATGACCA AGGTGAGCGA
7750  TCCGAT TGCCCGACTT GATCTGTGCC CTGATCTGCT
7800  CGTGAG GTGCTGCCGC GGCTGGGACC CGCGGTGGCC
7850  GATGTG GGCCGCGGGC CTGGTGCTGC TTGGTGCGTG
7900  ACGGCC TGCGCCCCGA GGGCCAGATC GCCACGGGCG
7950  GTCCTG ATCGAACGCG CCGTCACCTC GGGCCGGCTC
8000  GGCCAT CACGACGGCC GCGTTCACGC TCGGTATCCA
8050  TCGCCG TCGCCGCACT GCTGGCCGGT GGCCGTCCGA
8100  ATCCGC CGCCGTCGCC TCGACGGGAC CTGGCCGCTG
8150  GGCCGC GGGCACCGTG ATCCTGGCCG TGGTGTTCGC
8200  CAACGG TGCTGGAGGC CACCAGGATC CGCACCGCGA
8250  GAGTGG TGGACCGAGA AGCTGCGCTA CTACTACCTG
8300  CGACGG CGCGATCTCG CGGCGCGTGG CGTTCGTGTT
8350  TGTTCC CCTCGCTGTT CATGATGTTG CGGCGCAAGC
8400  GCACGC GGCCCGGCCT GGCGCCTAAT GGGCATCATC
8450  CTTCCT GATGTTCACG CCCACCAAGT GGACCCACCA
8500  CCGCGG TGGGCGGTGC GATGGCCGCG CTGGCGACCG
8550  ACGGTG CTGCGCTCGG CGCGCAACCG GATGGCGTTC
8600  GTTCGT GCTGGCGTTC TGCTTCGCCT CCACCAACGG
8650  CGAACT TCGGTGCGCC GTTCAACAAT TCGGTGCCCA
8700  CAGATC AGCGCGATCT TCTTCGCGCT GTCGGCGATC
8750  GTTCTG GTTGCACCTG ACGCGTCGCA CCGAATCCCG
8800  TGACCG CGGCGCCCAT CCCCGTCGCG GCCGGGTTCA
8850  ATGGCG TCCATGGCGA TCGGCGTGGT GCGCCAGTAC
8900  CGGGTG GGCCAACATC CGCGCGTTCG CGGGCGGTTG
8950  ACGTTC TGGTGGAACC GGATTCGAAC GCGGGCTTCC
9000  GGCGCG TACGGTCCGC TTGGCCCGCT GGGCGGCGAG
9050  CTCCCC CGACGGTGTT CCCGACCGCA TCATCGCCGA
9100  ACAATC CGCAGCCGGG CACCGATTAC GACTGGAACC
9150  GACGAG CCGGGCATCA ACGGTTCCAC CGTGCCGCTG
9200  CCCGAA GCGGGTTCCG GTCGCGGGTA CGTACTCCAC
9250  AGAGCA GGCTGTCCTC GGCGTGGTAC GAGCTTCCAG
9300  GAACGG GCTGCGCATC CGCTGGTGGT CATCACCGCC
9350  CGGCGA GAGCGTCGCC AACGGCCTGA CGACCGGCCA
9400  AGTACG CGACCCGCGG CCCGGACGGC ACCCTGGTGC
9450  ACACCG TACGACGTGG GGCCCACCCC GTCGTGGCGC
9500  GCGCTC GGAGATCCCC GACGATGCCG TCGCGGTGCG
9550  ATCTGT CACTGAGCCA GGGCGACTGG ATCGCGGTGA
9600  CCCGAG CTGCAGTCGG TGCAGGAGTA CGTCGGCTCC
9650  GATGGA CTGGGCCGTG GGTCTGGCGT TCCCGTGCCA
9700  ACGCCA ACGGCGTCAC CGAGGTGCCC AAGTTCCGCA
9750  TACGCC AAGCTGCAGA GCACCGACAC GTGGCAGGAC
9800  CCTGCT GGGCATCACC GACCTGCTGC TGCGGGCCTC
9850  ACCTGT CGCAGGACTG GGGCCAGGAC TGGGGTTCGT
9900  ACCGTC GTCGAAGCGA CGCCTGCCGA ACTCGATTTC
9950  CAGCGG TCTCTACAGC CCGGGGCCTT TGCGCATCCG
9960

(2) INFORMATION FOR SEQ ID NO:47:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
12

(2) INFORMATION FOR SEQ ID NO:48:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
12

(2) INFORMATION FOR SEQ ID NO:49:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
12

(2) INFORMATION FOR SEQ ID NO:50:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
12

(2) INFORMATION FOR SEQ ID NO:51:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
12

(2) INFORMATION FOR SEQ ID NO:52:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
12

(2) INFORMATION FOR SEQ ID NO:53:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
20  GAGA

(2) INFORMATION FOR SEQ ID NO:54:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
20  CAGT

(2) INFORMATION FOR SEQ ID NO:55:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
20  TCGA

(2) INFORMATION FOR SEQ ID NO:56:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
20  CCTC

(2) INFORMATION FOR SEQ ID NO:57:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
20  GACA

What is claimed is:
 1. A purified and isolated nucleic acid sequence of the embCAB operon from M. tuberculosis having the nucleotide sequence contained in FIG. 4 (SEQ ID NO:45).
 2. The nucleic acid of claim 1 having at least one mutation that results in ethambutol resistance.
 3. The nucleic acid of claim 2 wherein the mutation is selected from the group consisting of deletion, insertion, point, substitution, nonsense, missense, polymorphism, and rearrangement mutations.
 4. The nucleic acid of claim 3 wherein the mutation is a missense mutation.
 5. The nucleic acid of claim 4 wherein the missense mutation is located within SEQ ID NO:45, within the embB nucleic acid sequence of the embCAB operon.
 6. The nucleic acid of claim 4 wherein the missense mutation is located within SEQ ID NO:45, in the embA nucleic acid sequence of the embCAB operon.
 7. The nucleic acid of claim 3 wherein the mutation is a substitution mutation.
 8. The nucleic acid of claim 7 wherein the substitution mutation is located within SEQ ID NO:45, in the embA nucleic acid sequence of the embCAB operon.
 9. The nucleic acid of claim 3 wherein the mutation is a polymorphism.
 10. The nucleic acid of claim 9 wherein the polymorphism mutation is located within SEQ ID NO:45, in the embC nucleic acid sequence of the embCAB operon. 
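
The claims above distinguish several mutation classes, among them missense and nonsense mutations. As an illustration only, the following minimal Python sketch shows how a single-codon change might be classified using the standard genetic code. The function name `classify_codon_change` and the example codons are hypothetical and are not taken from SEQ ID NO:45 or from the specification.

```python
# Illustrative sketch: classify a single-codon substitution as silent,
# missense, or nonsense under the standard genetic code.
# All names and example codons here are assumptions for illustration.

BASES = "TCAG"
# Amino acids for the 64 codons enumerated in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Return 'silent', 'missense', or 'nonsense' for a codon substitution."""
    ref = CODON_TABLE[ref_codon.upper()]
    alt = CODON_TABLE[alt_codon.upper()]
    if ref == alt:
        return "silent"      # same amino acid (or stop) encoded
    if alt == "*":
        return "nonsense"    # sense codon changed to a stop codon
    return "missense"        # a different amino acid (or a lost stop) encoded


if __name__ == "__main__":
    # Hypothetical example codons, chosen only to exercise each branch.
    for ref, alt in [("ATG", "GTG"), ("TGG", "TGA"), ("CTG", "CTC")]:
        print(ref, "->", alt, ":", classify_codon_change(ref, alt))
```

The sketch handles only in-frame, codon-level substitutions; deletions, insertions and rearrangements would require comparing aligned sequences rather than single codons.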